Reviewing and correcting documentation
parent
d5d4df7874
commit
9fdd18faae
|
|
@ -1,13 +1,13 @@
|
|||
# Documentation
|
||||
|
||||
**martiLQ** stands for metadata reconcilation for transfer information, load quality
|
||||
**martiLQ** stands for metadata reconciliation for transfer information, load quality
|
||||
|
||||
Before starting with **martiLQ** it is advisable to understand if it is right for
|
||||
your organisation's needs. Information is available in a number of short
|
||||
your organization's needs. Information is available in a number of short
|
||||
documents.
|
||||
|
||||
There is no quick start document to get you started as each use case and
|
||||
organisation is different. There are sample implementations which you
|
||||
organization is different. There are sample implementations which you
|
||||
can adjust if they resonate with your circumstances,
|
||||
see [sample implementations](samples/)
|
||||
|
||||
|
|
|
|||
|
|
@ -1,7 +1,7 @@
|
|||
# Acknowledgment
|
||||
|
||||
Once the **martiLQ** document is received by a consumer then communicating the receipt, processing,
|
||||
success or failure completes the feedback loop and builds an extra layer of assurance for the organisation.
|
||||
success or failure completes the feedback loop and builds an extra layer of assurance for the organization.
|
||||
|
||||
The acknowledgement workflow provides the necessary feedback. If an acknowledgement is required as part of the
|
||||
consumption design then the following approach is recommended.
|
||||
|
|
@ -9,41 +9,55 @@ consumption design then the following is approach is recommended.
|
|||
1. The publisher provides callback details. For extra security the callback details should be signed.
|
||||
2. The consumer will acknowledge the receipt of the **martiLQ** document by sending back the same
|
||||
document to the publisher with some values changed.
|
||||
3. Change the root consumer and state (not resource) from ``active`` to ``receipt``.
|
||||
3. Change the root consumer and state (not resource) from ``active`` to ``receipt`` and modify the ``stateModified`` timestamp.
|
||||
4. Change the ``consumer`` data value to only be your identifier and not others, so that the publisher
|
||||
can identify the consumer and associate it with success or failure. This change to consumer value
|
||||
applies to all subsequent acknowledgement messages.
|
||||
5. Send the changed **martiLQ** document back using the callback details
|
||||
6. On fetching each resource the resource state is changed from ``active`` to ``received``. If any resource
|
||||
cannot be retrieved the state is changed from ``active`` to ``missing``.
|
||||
6. On fetching each resource the resource state is changed from ``active`` to ``received`` and the ``stateModified`` timestamp is updated.
|
||||
If any resource cannot be retrieved the state is changed from ``active`` to ``missing`` and the ``stateModified`` timestamp is updated.
|
||||
7. The consumer can elect to send back the **martiLQ** document to the publisher on each fetch or at the completion
|
||||
of all fetches. The recommendation is to send at the end of all fetches because if there are issues then
|
||||
having all the failures for analysis should assist in determining the extent of the failure.
|
||||
8. Once all resources are fetched (or failed), the root state is changed from ``receipt`` to ``received`` if no
|
||||
errors occurred in retrieving the resources. If a single or many errors occurred, then the root state is
|
||||
changed from ``receipt`` to ``missing``. The updated document is sent back to the publisher using
|
||||
the callback details.
|
||||
8. Once all resources are fetched (or failed), the root state is changed from ``receipt`` to ``received`` with an updated ``stateModified``
|
||||
timestamp if no errors occurred in retrieving the resources. If a single or many errors occurred, then the root state is
|
||||
changed from ``receipt`` to ``missing`` and the ``stateModified`` timestamp is updated. The updated document is sent back
|
||||
to the publisher using the callback details.
|
||||
9. The next stage is to validate and process the resources defined in the **martiLQ** document. This follows
|
||||
a similar process to fetching the resources.
|
||||
10. On processing each resource the resource state is changed from ``received`` to ``processed``. If any resource
|
||||
cannot be processed the state is changed from ``received`` to ``error``. Once again this can be acknowledged
|
||||
back to the publisher.
|
||||
11. Once all resources are processed (or failed), the root state is changed from ``received`` to ``processed`` if no
|
||||
errors occurred in processing the resources. If a single or many errors occurred, then the root state is
|
||||
changed from ``received`` to ``error``. The updated document is sent back to the publisher using
|
||||
10. On processing each resource the resource state is changed from ``received`` to ``processed`` and the ``stateModified`` timestamp is updated.
|
||||
If any resource cannot be processed the state is changed from ``received`` to ``error`` and the ``stateModified`` timestamp is updated. Once
|
||||
again this can be acknowledged back to the publisher.
|
||||
11. Once all resources are processed (or failed), the root state is changed from ``received`` to ``processed`` and the ``stateModified`` timestamp
|
||||
updated if no errors occurred in processing the resources. If a single or many errors occurred, then the root state is
|
||||
changed from ``received`` to ``error`` and the ``stateModified`` timestamp modified. The updated document is sent back to the publisher using
|
||||
the callback details.
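
The state changes in the fetch stage above can be sketched in Python. This is a minimal illustration under stated assumptions, not the reference implementation: the ``resources`` key, the helper names and the ISO 8601 timestamp format are hypothetical, while ``consumer``, ``state`` and ``stateModified`` are the values named in the workflow.

```python
import copy
from datetime import datetime, timezone


def _now() -> str:
    # ISO 8601 UTC timestamp; the exact timestamp format is an assumption.
    return datetime.now(timezone.utc).isoformat()


def acknowledge_receipt(document: dict, consumer_id: str) -> dict:
    """Steps 3 and 4: root state active -> receipt, consumer narrowed to one identifier."""
    ack = copy.deepcopy(document)  # identifiers and other data stay untouched
    ack["consumer"] = consumer_id
    ack["state"] = "receipt"
    ack["stateModified"] = _now()
    return ack


def record_fetch(resource: dict, fetched: bool) -> None:
    """Step 6: resource state active -> received, or active -> missing."""
    resource["state"] = "received" if fetched else "missing"
    resource["stateModified"] = _now()


def close_fetch_stage(ack: dict) -> None:
    """Step 8: root state receipt -> received, or receipt -> missing on any failure."""
    # "resources" is a hypothetical key name for the list of resources.
    failed = any(r["state"] == "missing" for r in ack["resources"])
    ack["state"] = "missing" if failed else "received"
    ack["stateModified"] = _now()
```

The processing stage (steps 10 and 11) would follow the same shape, with ``processed`` and ``error`` in place of ``received`` and ``missing``.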
|
||||
|
||||
This completes the acknowledgment workflow for the **martiLQ** document. The level of acknowledgement feedback
|
||||
you wish to implement as a consumer is your decision. Any publisher providing callback details for acknowledgement can also
|
||||
choose their behaviour on actions and recording any acknowledgments received.
|
||||
choose their behavior on actions and recording any acknowledgments received.
|
||||
|
||||
In the above acknowledgement process, you **must not** change the identifiers in the **martiLQ** document and you **should not**
|
||||
change other data except the ``consumer`` and ``state`` and ``stateModified``.
|
||||
|
||||
If you are the publisher and expect acknowledgment then there is an extra scenario you need to cater for. The scenario is
|
||||
that you do not recieve any acknowledgement back from the expected consumer(s) within the agreed timeframe. In this situation
|
||||
that you do not receive any acknowledgement back from the expected consumer(s) within the agreed time frame. In this situation
|
||||
the publisher will need to know each consumer and their service level agreements.
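
One way a publisher might detect the missing acknowledgement scenario is sketched below. The function name, the mapping of consumer identifier to agreed time frame and the set of acknowledged consumers are all assumptions made for illustration; how a publisher actually stores service level agreements is not defined here.

```python
from datetime import datetime, timedelta, timezone


def overdue_consumers(sent_at, sla_by_consumer, acknowledged, now=None):
    """Return consumers whose agreed time frame has passed without an acknowledgement.

    sent_at: when the document was published (timezone-aware datetime)
    sla_by_consumer: hypothetical mapping of consumer identifier -> timedelta
    acknowledged: set of consumer identifiers that have already responded
    """
    now = now or datetime.now(timezone.utc)
    return sorted(
        consumer
        for consumer, sla in sla_by_consumer.items()
        if consumer not in acknowledged and now - sent_at > sla
    )
```

The publisher could run such a check on a schedule and follow up with each consumer returned.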
|
||||
|
||||
## Callback
|
||||
|
||||
The callback method can take any form. The normal expectation is that the same method used to communicate the **martiLQ**
|
||||
document is used for the callback, for example:
|
||||
|
||||
* If the **martiLQ** document is originally sent by Kafka queue then the callback should use a separate Kafka topic
|
||||
* If the **martiLQ** document is originally sent by REST API then the callback is a publisher REST API end point
|
||||
* If the **martiLQ** document is originally sent by email then the callback uses a reply address
|
||||
|
||||
While the above is the expected convention, you can mix and match. For example:
|
||||
|
||||
* If the **martiLQ** document is originally sent by REST API then the callback can be an email address. This
|
||||
situation could be acceptable if the publisher does not have the ability to accept REST API callbacks.
|
||||
|
||||
## Compressed file handling
|
||||
|
||||
When the **martiLQ** document is defining a parent compressed file, e.g. ZIP or 7Z, then the resources are expected
|
||||
|
|
@ -52,7 +66,3 @@ state of the resource is still changed to reflect the processing.
|
|||
|
||||
If the file cannot be extracted either because it has not been included or there is a decompression error, then the
|
||||
same acknowledgement process of using the state is used.
|
||||
|
||||
## Error situations
|
||||
|
||||
|
||||
|
|
|
|||
|
|
@ -1,11 +1,10 @@
|
|||
Comparison of martiLQ document definition
|
||||
=========================================
|
||||
# Comparison of martiLQ document definition
|
||||
|
||||
The use of metadata definitions is not unique and examples
|
||||
exist in many different situations. Some are standard and open
|
||||
while others are closed.
|
||||
|
||||
Some open standards are EXIF data for pictures, SQL DDL defintions
|
||||
Some open standards are EXIF data for pictures, SQL DDL definitions
|
||||
for databases, the XMP definition and web header responses before the
|
||||
web content.
|
||||
|
||||
|
|
@ -23,18 +22,16 @@ a CKAN source into a **martiLQ** document definition and then process the data
|
|||
through the pipeline as you would for any other data file that had a
|
||||
**martiLQ** document definition.
|
||||
|
||||
Benefit of CKAN and martiLQ
|
||||
---------------------------
|
||||
## Benefit of CKAN and martiLQ
|
||||
|
||||
CKAN is excellent at defining the data source details but it lacks information
|
||||
for load quality. If you have CKAN deployed in your organisation and wish
|
||||
for load quality. If you have CKAN deployed in your organization and wish
|
||||
to exchange or process the data referenced in CKAN, then there are synergies between
|
||||
CKAN and marti.
|
||||
|
||||
Samples exist on CKAN integration.
|
||||
|
||||
Magda and martiLQ
|
||||
-----------------
|
||||
## Magda and martiLQ
|
||||
|
||||
Another source of data is [Magda](https://magda.io/) which has API metadata
|
||||
definitions. Magda is more about data federation and as such provides
|
||||
|
|
|
|||
|
|
@ -1,7 +1,7 @@
|
|||
Custom Definition
|
||||
=================
|
||||
|
||||
The custome definition section allows the inclusion of extensions
|
||||
The custom definition section allows the inclusion of extensions
|
||||
to the standard. To demonstrate the inclusion, there are three
|
||||
sample extensions. These are:
|
||||
|
||||
|
|
|
|||
|
|
@ -1,5 +1,5 @@
|
|||
Magda definitions
|
||||
=================
|
||||
# Magda definitions
|
||||
|
||||
|
||||
https://magda.io/
|
||||
|
||||
|
|
|
|||
|
|
@ -39,7 +39,7 @@ modified|Modified date and time of the **martiLQ** document|Now
|
|||
tags|List of tags or keywords|
|
||||
publisher|Publisher name|
|
||||
contactPoint|Contact point of a person or team|
|
||||
accessLevel|Acces level|
|
||||
accessLevel|Access level|
|
||||
rights|Rights|
|
||||
|
||||
* Batch
|
||||
|
|
@ -54,7 +54,7 @@ rights|Rights|
|
|||
### Information extension
|
||||
|
||||
The information supplied can be extended by party agreement and there
|
||||
are place holders in the defintion.
|
||||
are place holders in the definition.
|
||||
|
||||
## Resource
|
||||
|
||||
|
|
@ -62,7 +62,7 @@ The resource section is a list of documents or files that are to be grouped
|
|||
together are listed under the same **martiLQ** definition.
|
||||
|
||||
At least one document or file must be included. If the same resource is repeated
|
||||
it will commonly be for definiting multiple formats, with each file having a
|
||||
it will commonly be to define multiple formats, with each file having a
|
||||
different extension. Commonly the definition includes at least the following
|
||||
items:
|
||||
|
||||
|
|
@ -84,7 +84,7 @@ for more details
|
|||
Name|Description|Default or values
|
||||
---|---|--
|
||||
hash|Hash of document - The hash of the document, which can be blank especially for large documents
|
||||
algo|Hash algorithm - Algoroithm used to generate the hash value or sign it
|
||||
algo|Hash algorithm - Algorithm used to generate the hash value or sign it
|
||||
description|Description - A more detailed description
|
||||
version|Version - A document version
|
||||
encoding|Encoding
|
||||
|
|
|
|||
|
|
@ -19,7 +19,7 @@ load quality metrics.
|
|||
* Number of records in the document - This is the number of data primary records not the
|
||||
count of end of lines and is agreed between parties. XML record counts could be based
|
||||
on the number of primary segments under root. JSON records can be counted in a similar way.
|
||||
The headers or trailling records are not counted
|
||||
The headers or trailing records are not counted
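
The difference between a record count and an end-of-line count can be illustrated with a short Python sketch for CSV data. The function name and the ``header_rows`` and ``trailer_rows`` parameters are hypothetical; the number of header and trailer rows is whatever the parties have agreed.

```python
import csv
import io


def count_data_records(text: str, header_rows: int = 1, trailer_rows: int = 0) -> int:
    """Count primary data records in CSV text, excluding header and trailer rows."""
    # csv.reader keeps a quoted field with an embedded newline as one record.
    rows = [row for row in csv.reader(io.StringIO(text)) if row]  # skip blank lines
    return max(len(rows) - header_rows - trailer_rows, 0)


sample = 'id,note\n1,"line one\nline two"\n2,plain\nTRAILER,2\n'
records = count_data_records(sample, header_rows=1, trailer_rows=1)
line_ends = sample.count("\n")
```

In this sample the text has five end-of-line characters but only two data records, because one record spans two lines and the header and trailer rows are excluded.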
|
||||
|
||||
## Addresses deficiencies
|
||||
|
||||
|
|
|
|||
|
|
@ -1,7 +1,7 @@
|
|||
# References
|
||||
|
||||
The following are references to documents that inspired the creation of **martiLQ**
|
||||
document and associatd framework.
|
||||
document and associated framework.
|
||||
|
||||
https://dex.dss.gov.au/sites/default/files/documents/2021-06/data-exchange-protocols-june-2021.pdf
|
||||
|
||||
|
|
|
|||
|
|
@ -12,7 +12,7 @@ for various programming languages and situations. As many programming languages
|
|||
generate portable programs that can execute on multiple operating systems, the
|
||||
likelihood is that a tool exists for you.
|
||||
|
||||
The source for tools is provided in the Github repository and some have precompiled
|
||||
The source for tools is provided in the GitHub repository and some have pre-compiled
|
||||
images.
|
||||
|
||||
See the project source directory for more details.
|
||||
|
|
|
|||
|
|
@ -1,13 +1,11 @@
|
|||
Who is likely to use martiLQ
|
||||
============================
|
||||
# Who is likely to use martiLQ
|
||||
|
||||
You are likely to find the **martiLQ** framework relevant if you:
|
||||
|
||||
1. Have many document exchanges, such as End of Day batches
|
||||
2. Need to verify or reconcile the documents
|
||||
|
||||
Data exchanges
|
||||
--------------
|
||||
## Data exchanges
|
||||
|
||||
If you are creating or receiving many documents or files on a regular basis
|
||||
then you probably have some framework defined. The framework may be as simple as:
|
||||
|
|
@ -24,8 +22,7 @@ Simple framework such as the above have limitations, such as:
|
|||
* Lower automation prospects and alignment to DataSecOps
|
||||
* Poor fit to web applications (they tend to be designed for FTP and LAN)
|
||||
|
||||
Framework Sidecar files
|
||||
-----------------------
|
||||
## Framework Sidecar files
|
||||
|
||||
The **martiLQ** framework addresses the issues and limitations by using sidecar
|
||||
or shadow files. The [concept of sidecar files](https://en.wikipedia.org/wiki/Sidecar_file) is
|
||||
|
|
@ -35,7 +32,7 @@ Sidecar files can also be implemented as ``forks`` and built into the operating
|
|||
in Mac OS X HFS. The Microsoft NTFS supports Alternate Data Streams to achieve a similar outcome.
|
||||
Unfortunately this information is not transferrable to other systems.
|
||||
|
||||
The proposition is to define a format for the sidecare file and provide common library tools that
|
||||
The proposition is to define a format for the sidecar file and provide common library tools that
|
||||
can be used on multiple platforms when exchanging documents / files. Multiple documents can be
|
||||
defined in a singel **martiLQ** definition which adds to efficiency and productivity if used
|
||||
defined in a single **martiLQ** definition which adds to efficiency and productivity if used
|
||||
for End of Day or similar batches - or even single file transfers.
|
||||
|
|
|
|||
115
pattern.md
|
|
@ -6,8 +6,8 @@
|
|||
and intended to be consumed by another system component with self-describing information with
|
||||
load assurance metrics.
|
||||
|
||||
The consuming system component can be at the same location, a dfifferent geographical location,
|
||||
the same organisation or another organisation.
|
||||
The consuming system component can be at the same location, a different geographical location,
|
||||
the same organization or another organization.
|
||||
|
||||
The pattern does not define the format that the data file or document must take or how the data is transferred
|
||||
or accessed. You choose the data format and transfer method. Once you have made the choice, you can describe
|
||||
|
|
@ -23,7 +23,7 @@ to demonstrate generating the **martiLQ** document.
|
|||
|
||||
## Problem statement
|
||||
|
||||
Even though event streaming is a stragetic goal for many organisations, there exists legcay processes and there
|
||||
Even though event streaming is a strategic goal for many organizations, there exist legacy processes and there
|
||||
will continue to be a need to transfer data files and other documents from one system to another.
|
||||
|
||||
When a handover of a data file or document occurs, the best practice is to include metrics with the transfer
|
||||
|
|
@ -31,14 +31,14 @@ to assure the recipient of provenance and quality of the data file or document.
|
|||
with the data file or document.
|
||||
|
||||
A document includes unstructured data, letters, pictures, binary objects while data files could be thought of
|
||||
as strutured data that is describes multiple records.
|
||||
as structured data that describes multiple records.
|
||||
|
||||
### Assurance Problem
|
||||
|
||||
**How does the recipient know they have received all related files, the provenance, it is immutable and
|
||||
assurance on quality?**
|
||||
|
||||
Many organisations have used the file name as the carrier of this information but this has limits.
|
||||
Many organizations have used the file name as the carrier of this information but this has limits.
|
||||
|
||||
## Efficiency Problem
|
||||
|
||||
|
|
@ -54,7 +54,7 @@ Therefore the objective is to produce a documentation standard that:
|
|||
1. provides load assurance when transferring data files and documents
|
||||
2. can be tooled and therefore achieve some level of automation
|
||||
3. is extensible to give the publisher and consumer control as to the level of assurance
|
||||
required to match the risk appetite of the organisation
|
||||
required to match the risk appetite of the organization
|
||||
|
||||
## Context
|
||||
|
||||
|
|
@ -84,34 +84,64 @@ as they are considered the minimal for best practice
|
|||
* Format, encoding, compression
|
||||
* Data record count
|
||||
|
||||
There is an acknowledgment processs that is recommended for confirmation on processing. See
|
||||
[acknowldegment](docs/source/acknowledgement.md) for approach details.
|
||||
There is an acknowledgment process that is recommended for confirmation on processing. See
|
||||
[acknowledgement](docs/source/acknowledgement.md) for approach details.
|
||||
|
||||
## Forces
|
||||
|
||||
The qualities that this pattern is addressing...
|
||||
The qualities that this pattern is addressing are:
|
||||
|
||||
1. Frees transfer from file naming conventions that include magic strings that store metadata
|
||||
2. An event message based paradigm which is independent of the size and number of files
|
||||
3. Publishes basic metadata on the files and their source
|
||||
4. Secures the file transfer from tampering or corruption
|
||||
5. Allows the inclusion of quality metrics such as provenance, elements and record counts
|
||||
6. Allows the consumer to select the files to process avoiding unnecessary transfers
|
||||
7. Provides a simple acknowledgment process
|
||||
8. Is extendable
|
||||
|
||||
The file transfer pattern is the original method for separate processes to exchange data, with the file stored on magnetic tape and either
|
||||
loaded back onto the same compute resource (think mainframe) or physicaly couriered to another lcoation or tape drive. The
|
||||
loaded back onto the same compute resource (think mainframe) or physically couriered to another location or tape drive. The
|
||||
reference book [Enterprise Integration Patterns](https://www.enterpriseintegrationpatterns.com/patterns/messaging/FileTransferIntegration.html)
|
||||
by Hohpe and Woolf recognises this by inculsion of the pattern written by Martin Fowler.
|
||||
by Hohpe and Woolf recognizes this by inclusion of the pattern written by Martin Fowler.
|
||||
|
||||
This pattern addresess the issues and concerns that relate to file transfer. Many of these are related the the common
|
||||
In the explanation written by Martin Fowler, he makes observations about the "File Transfer" including:
|
||||
|
||||
"Part of what makes _File Transfer_ simple is that no extra tools or integration packages are needed, but that also means that developers
|
||||
have to do a lot of the work themselves. The applications must agree on file-naming conventions and the directories in which they appear. ...
|
||||
, then some application must take responsibility for transferring the file from one disk to another"
|
||||
|
||||
The pattern being described here addresses the issues and concerns that relate to file transfer. Many of these are related to the common
|
||||
non-functional requirements that architects cover in solution designs.
|
||||
|
||||
### Security, robustness, reliability, fault-tolerance
|
||||
|
||||
The pattern defines how security and assurance is applied to the data files and documents. The pattern does
|
||||
not define how to set up a reliable infrastructure, but it can be used to detect failures
|
||||
in the infrastructire. The fault-tolerance allowance is up to each implementation.
|
||||
in the infrastructure. The fault-tolerance allowance is up to each implementation.
|
||||
|
||||
Fault-tolerance and the actionable task can be dialled from 0% tolerance to 100% tolerance on a
|
||||
case by case basis.
|
||||
|
||||
### Manageability
|
||||
|
||||
The pattern takes into consideration how the file transfer is managed. It can provide a standard
|
||||
that makes file transfers easier to manage regardless of the underlying transport mechanism. As the
|
||||
same **martiLQ** document can be consumed by multiple recipients, it is easy to distribute, with
|
||||
access controls ensuring only authorized recipients can access the files that are relevant to them.
|
||||
|
||||
This reduces the need to run multiple jobs distributing the files.
|
||||
|
||||
Management is also improved if the acknowledgement capability is implemented so that the publisher
|
||||
knows which recipient has processed the file. If the recipient no longer wants the file and stops
|
||||
processing, the publisher will slowly build a timeline and metrics to recognize that the file
|
||||
is no longer consumed. The publisher can then cease to produce unused files.
|
||||
|
||||
### Efficiency, performance, throughput, bandwidth requirements, space utilization
|
||||
|
||||
If the process is using event based messaging, files that are not required at the destination
|
||||
are never transferred. This saves bandwidth and storage at the destination.
|
||||
|
||||
### Scalability (incremental growth on-demand)
|
||||
|
||||
The pattern scalability is not bound to the size of the data files themselves. The pattern can
|
||||
|
|
@ -120,43 +150,64 @@ may be factor in the decision of breaking down to smaller volumes.
|
|||
|
||||
### Extensibility, evolvability, maintainability
|
||||
|
||||
The **martiLQ** document can be customised and can evolve as the market conidtions change. Versioning
|
||||
The **martiLQ** document can be customized and can evolve as the market conditions change. Versioning
|
||||
is built into the definition and consumers can select which attributes are mandatory for
|
||||
processing.
|
||||
|
||||
### Modularity, independence, re-usability, openness, composability (plug-and-play), portability
|
||||
|
||||
The **martiLQ** document is an open definition that can be used in many file transfer scenarios. You can compose
|
||||
new functionality on top of the open code.
|
||||
|
||||
The **martiLQ** document is portable as is the reference implementation. You can run the Python, PowerShell, Go code
|
||||
on both Windows and Linux platforms and on different architectures.
|
||||
|
||||
### Completeness and correctness
|
||||
|
||||
The **martiLQ** document contains metadata to ensure that all files in a job file transfer are treated
|
||||
as a package or integral unit. If files are missing then this is recognized early in the process and
|
||||
the recipient consumer can decide on whether to continue processing or halt.
|
||||
|
||||
Additional scope also exists in the **martiLQ** document to add more load quality assurance metrics, which can be
|
||||
automatically processed to ensure correctness.
|
||||
|
||||
### Ease-of-construction
|
||||
|
||||
The **martiLQ** document is a JSON formatted document, making it easy to construct using modern tools.
|
||||
|
||||
**Note**: An XML format document is in the backlog for possible implementation.
|
||||
|
||||
Using the reference implementation, the organization can implement the pattern into their current process without
|
||||
requiring extensive builds. The reference implementation has code in various programming languages and can run on
|
||||
Windows and Linux platforms.
|
||||
|
||||
All code is visible and auditable.
|
||||
|
||||
**Note**: The reference implementation is not the most efficient code in all situations and there is much room for
|
||||
improvement. The objective of the reference implementation was to demonstrate the ease of use by scanning
|
||||
a file system directory or converting from another format such as CKAN.
|
||||
|
||||
### Ease-of-use
|
||||
|
||||
## Solution
|
||||
If you are comfortable with using the reference implementation, you can be generating **martiLQ** documents in a short time.
|
||||
|
||||
|
||||
|
||||
A description, using text and/or graphics, of how to achieve the intended goals and objectives. The description should identify both the solution's static structure and its dynamic behavior - the people and computing actors, and their collaborations. The description may include guidelines for implementing the solution. Variants or specializations of the solution may also be described.
|
||||
Download the git repository code, review the samples and adjust to scan your directory to generate the **martiLQ** document.
|
||||
As a simplistic approach you can execute the code in your pipeline after you have created existing files.
|
||||
|
||||
## Resulting Context
|
||||
|
||||
After applying the pattern consistently on file transfers within the organization, the expectation is that you will spend less
|
||||
time discussing and building the mechanics of file transfer including the polling, monitoring and load assurance. A large portion
|
||||
of which will have been done for you as part of the **martiLQ** document and its implementation.
|
||||
|
||||
There will also be less documentation to review and discuss as the **martiLQ** document will provide standards that developers can
|
||||
follow such as the encoding and format. This applies to both structured and unstructured data.
|
||||
|
||||
The post-conditions after the pattern has been applied. Implementing the solution normally requires trade-offs among competing forces.
|
||||
This element describes which forces have been resolved and how, and which remain unresolved. It may also indicate other patterns that may be applicable in the new context. (A pattern may be one step in accomplishing some larger goal.) Any such other patterns will be described in detail under Related Patterns.
|
||||
You will still need to decide, for structured data, how many files to produce and which data columns and records to include in each file.
|
||||
|
||||
For unstructured data, such as bundling like documents together, the process is much simpler if you take the approach to create
|
||||
a folder containing the files and then execute the routine to compress and package all together.
|
||||
|
||||
## Examples
|
||||
|
||||
Please refer to the [documentation](docs/source/README.md) and [samples](docs/source/samples/README.md)
|
||||
|
||||
## Rationale
|
||||
|
||||
An explanation/justification of the pattern as a whole, or of individual components within it, indicating how the pattern actually works, and why - how it resolves the forces to achieve the desired goals and objectives, and why this is "good". The Solution element of a pattern describes the external structure and behavior of the solution: the Rationale provides insight into its internal workings.
|
||||
|
||||
## Related Patterns
|
||||
|
||||
The relationships between this pattern and others. These may be predecessor patterns, whose resulting contexts correspond to the initial context of this one; or successor patterns, whose initial contexts correspond to the resulting context of this one; or alternative patterns, which describe a different solution to the same problem, but under different forces; or co-dependent patterns, which may/must be applied along with this pattern.
|
||||
|
||||
## Known Uses
|
||||
|
||||
Known applications of the pattern within existing systems, verifying that the pattern does indeed describe a proven solution to a recurring problem. Known Uses can also serve as Examples.
|
||||
|
|
|
|||
61
samples.md
|
|
@ -1,15 +1,13 @@
|
|||
# Sample execution
|
||||
|
||||
A number of samples are provided to demonstrate what the **martiLQ** documents
|
||||
look like and how simple the exceution can be.
|
||||
|
||||
For the BSB (Bank State Branch) samples below, you will first need fetch the files for
|
||||
lcoal processing. See TBA
|
||||
look like and how simple the execution can be.
|
||||
|
||||
## Python
|
||||
|
||||
If you have the required Python software and packages installed, and have Internet access,
|
||||
then the following commands should generate output for you.
|
||||
then the following commands will generate output for you. If you use
|
||||
a proxy, then there can be issues.
|
||||
|
||||
Open a terminal with the current directory set to the project root (here)
|
||||
|
||||
|
|
@ -20,12 +18,12 @@ Open a terminal with the current directory set to the project root (here)
|
|||
.\source\python\client\martiLQ.py -t GEN -s "./docs/source/samples/python/test/http/" -o "./test/python/results/test_proc_bsb.json" -c ./docs/source/samples/json/sample_bsb.ini -u http://apnedata.merebox.com.s3.ap-southeast-2.amazonaws.com/au/bsb/
|
||||
```
|
||||
|
||||
For details using Python samples see
|
||||
There are also a number of Python test scripts you can execute
|
||||
|
||||
## PowerShell
|
||||
|
||||
If you have the required PowerShell software and packages installed, and have Internet
|
||||
then the following commands should generate output for you.
|
||||
access then the following commands will generate output for you.
|
||||
|
||||
Open a terminal with the current directory set to the project root (here)
|
||||
|
||||
|
|
@ -33,18 +31,19 @@ The PowerShell command
|
|||
|
||||
```ps1
|
||||
|
||||
.\test\powershell\martiLQ_base_test.ps1
|
||||
|
||||
# This sample will retrieve a number of CKAN files from
|
||||
# Australian government sites to demonstrate conversion
|
||||
.\test\powershell\test_MartiLQCkan.ps1
|
||||
# Australian government and Singapore sites to demonstrate conversion
|
||||
.\test\powershell\martiLQ_ckan_test.ps1
|
||||
|
||||
```
|
||||
|
||||
For details using PowerShell samples see
|
||||
|
||||
## Go
|
||||
|
||||
If you have the required GOLANG software and packages installed, and have Internet
|
||||
then the following commands should generate output for you.
|
||||
access then the following commands will generate output for you. If you use
|
||||
a proxy, then there can be issues.
|
||||
|
||||
Open a terminal with the current directory set to the project root (here)
|
||||
|
||||
|
|
@ -61,7 +60,7 @@ go run . -- -t GEN -m %MARTILQ_PROJECT_PATH%/test/golang/results/test_proc_bsb.j
|
|||
cd %MARTILQ_PROJECT_PATH%
|
||||
```
|
||||
|
||||
A PowerShell script to execute
|
||||
A PowerShell script to execute the Go program
|
||||
|
||||
```ps1
|
||||
$env:MARTILQ_PROJECT_PATH=Get-Location
|
||||
|
|
@ -81,39 +80,3 @@ go run . -- -t MAKE -m $mfile -c $cfile -s $spath --title "GEN005" --description

Set-Location -Path $env:MARTILQ_PROJECT_PATH -PassThru
```

For details using Go samples see

go run . -- -t GEN -m ./test/test_main_doc_Sample01.json -s ./docs/source/martilq.md --title "GEN001" --description "Simple example with no config"

go run . -- -t GEN -m ./test/test_main_doc_Sample02.json -c ./docs/source/samples/json/GEN002.ini -s ./docs/source/martilq.md --title "GEN002" --description "Simple example"

go run . -- -t GEN -m ./test/test_main_doc_Sample03.json -c ./docs/source/samples/json/GEN002.ini -s ./docs/source/ --title "GEN003" --description "Directory example"

go run . -- -t GEN -m ./test/test_main_doc_Sample04.json -s ./docs/source/ --title "GEN004" --description "Directory example with filter" -R --filter "r.*\.md"

go run . -- -t GEN -m ./test/test_main_doc_Sample05.json -c ./docs/source/samples/json/GEN005.ini -s C.\docs\source\samples\python\test\http\ --title "GEN005" --description "Directory example for BSB with filter" -R --filter "BSBDirectory.*\.csv"

https://github.com/meerkat-manor/marti/blob/draft_specifications/docs/source/martiLQ.md
https://github.com/meerkat-manor/marti/blob/draft_specifications/docs/source/martiLQ.md

SET GO_PROJECT_PATH=

go run . -- -t GEN -m ./test/test_main_doc_Sample05.json -c ./docs/source/samples/json/GEN005.ini -s .\docs\source\samples\python\test\http\ --title "GEN005" --description "Directory example for BSB with filter" -R --filter "BSBDirectory.*\.csv"

.\source\python\client\martiLQ.py -t GET -s "./test/python/results/data" -o "./test/python/results/test_proc_bsb.json"

@ -152,7 +152,7 @@ func (c *configuration) SaveConfig(ConfigPath string) bool {
cfgini.Section("General").Key("logPath").SetValue (c.logPath)
cfgini.Section("General").Key("tempPath").SetValue (c.tempPath)
cfgini.Section("General").Key("dataPath").SetValue (c.dataPath)
cfgini.Section("General").Key("dateFormat").SetValue (c.datdateFormataPath)
cfgini.Section("General").Key("dateFormat").SetValue (c.dateFormat)
cfgini.Section("General").Key("dateTimeFormat").SetValue (c.dateTimeFormat)

cfgini.Section("MartiLQ").Key("tags").SetValue(c.tags)

@ -1,2 +1,2 @@

package martilq

@ -0,0 +1,284 @@
package main

import (
	"fmt"
	"io/ioutil"
	"os"
	"strings"

	"merebox.com/martilq"
)

// Parameters holds the values collected from the command line.
type Parameters struct {
	help bool

	task           string
	sourcePath     string
	recursive      bool
	filter         string
	update         bool
	urlPrefix      string
	configPath     string
	definitionPath string
	outputPath     string

	title       string
	description string
	describedBy string
	landing     string
}

var params Parameters

// loadArguments parses the command line into the package-level params.
// Options that take a value advance the index first and only then check
// the bound, so a trailing option without a value reports a clear error
// instead of indexing past the end of the slice.
func loadArguments(args []string) {

	maxArgs := len(args)
	ix := 1
	for ix < maxArgs {
		matched := false

		if args[ix] == "-h" || args[ix] == "--help" {
			matched = true
			params.help = true
			break
		}

		if args[ix] == "-t" || args[ix] == "--task" {
			matched = true
			ix = ix + 1
			if ix < maxArgs {
				params.task = strings.ToUpper(args[ix])
			} else {
				panic("Missing parameter for TASK")
			}
		}

		if args[ix] == "-c" || args[ix] == "--config" {
			matched = true
			ix = ix + 1
			if ix < maxArgs {
				params.configPath = args[ix]
			} else {
				panic("Missing parameter for CONFIG")
			}
		}

		if args[ix] == "-s" || args[ix] == "--source" {
			matched = true
			ix = ix + 1
			if ix < maxArgs {
				params.sourcePath = args[ix]
			} else {
				panic("Missing parameter for SOURCE")
			}
		}

		if args[ix] == "-m" || args[ix] == "--martilq" {
			matched = true
			ix = ix + 1
			if ix < maxArgs {
				params.definitionPath = args[ix]
			} else {
				panic("Missing parameter for MARTILQ")
			}
		}

		if args[ix] == "-o" || args[ix] == "--output" {
			matched = true
			ix = ix + 1
			if ix < maxArgs {
				params.outputPath = args[ix]
			} else {
				panic("Missing parameter for OUTPUT")
			}
		}

		if args[ix] == "-R" || args[ix] == "--recursive" {
			matched = true
			params.recursive = true
		}

		if args[ix] == "--update" {
			matched = true
			params.update = true
		}

		if args[ix] == "--title" {
			matched = true
			ix = ix + 1
			if ix < maxArgs {
				params.title = args[ix]
			} else {
				panic("Missing parameter for TITLE")
			}
		}

		if args[ix] == "--filter" {
			matched = true
			ix = ix + 1
			if ix < maxArgs {
				params.filter = args[ix]
			} else {
				panic("Missing parameter for FILTER")
			}
		}

		if args[ix] == "--description" {
			matched = true
			ix = ix + 1
			if ix < maxArgs {
				// A leading "@" means the description is read from a file
				if strings.HasPrefix(args[ix], "@") {
					desc, err := ioutil.ReadFile(args[ix][1:])
					if err != nil {
						panic("Description file not found: " + args[ix][1:])
					}
					params.description = string(desc)
				} else {
					params.description = args[ix]
				}
			} else {
				panic("Missing parameter for DESCRIPTION")
			}
		}

		if !matched && args[ix] != "--" {
			fmt.Println("Unrecognised command line argument: " + args[ix])
		}

		ix = ix + 1
	}

}
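
The value-taking options in `loadArguments` all repeat the same "advance, then bounds-check" step. As a minimal sketch only, and not part of this commit, that step could be factored into a helper. The name `nextValue` and its signature are hypothetical.

```go
package main

import (
	"fmt"
)

// nextValue is a hypothetical helper (not in the martilq sources).
// It returns the value following the option at index ix, together with
// the index of that value. The bound is checked before the slice is read,
// so a trailing option with no value yields an error rather than a panic.
func nextValue(args []string, ix int, name string) (string, int, error) {
	if ix+1 >= len(args) {
		return "", ix, fmt.Errorf("missing parameter for %s", name)
	}
	return args[ix+1], ix + 1, nil
}

func main() {
	args := []string{"martilqcli", "-t", "make", "--title"}

	// "-t" at index 1 is followed by its value at index 2.
	task, ix, err := nextValue(args, 1, "TASK")
	fmt.Println(task, ix, err)

	// "--title" is the last argument, so no value is available.
	_, _, err = nextValue(args, 3, "TITLE")
	fmt.Println(err)
}
```

Returning an error instead of calling `panic` leaves the choice of exit behaviour to the caller, which may suit a reference implementation that others copy from.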

func printHelp() {

	fmt.Println("")
	fmt.Println("\t martilqcli_client ")
	fmt.Println("\t =======++======== ")
	fmt.Println("")
	fmt.Println("\tThis program is intended as a simple reference implementation")
	fmt.Println("\tin Go of the MartiLQ framework. It does not provide all")
	fmt.Println("\tthe possible functionality but enough to demonstrate the concept.")
	fmt.Println("")

	fmt.Println(" The command line arguments are:")
	fmt.Println("")
	fmt.Println(" -h or --help : Display this help")
	fmt.Println(" -t or --task : Execute a predefined task, which is one of")
	fmt.Println(" INIT initialise a new configuration file")
	fmt.Println(" MAKE make a MartiLQ definition file")
	fmt.Println(" GET resources based on MartiLQ definition file")
	fmt.Println(" RECON reconcile a MartiLQ definition file")
	fmt.Println(" -c or --config : Configuration file used by all tasks")
	fmt.Println(" This is the file written by the INIT task")
	fmt.Println(" -s or --source : Source directory or file to build MartiLQ definition")
	fmt.Println(" This is used by the MAKE and RECON task")
	fmt.Println(" -m or --martilq : MartiLQ definition file")
	fmt.Println(" This is used by the MAKE and RECON task")
	fmt.Println(" The MAKE task makes the file while")
	fmt.Println(" RECON task reads the file")
	fmt.Println(" -o or --output : Output file")
	fmt.Println(" This is used by the RECON task")

	fmt.Println("")
	fmt.Println(" --title : Title for the MartiLQ. Think of this as")
	fmt.Println(" the job name")
	fmt.Println(" This is used by the MAKE task")
	fmt.Println(" --description : Description for the MartiLQ. This can be text")
	fmt.Println(" or a pointer to a file when the @ prefix is used")
	fmt.Println(" This is used by the MAKE task")
	fmt.Println(" --update : Update existing definition, otherwise fail if it exists already")
	fmt.Println(" This is used by the MAKE task")
	fmt.Println(" --filter : File filter")
	fmt.Println(" This is used by the MAKE task")
	fmt.Println(" -R or --recursive : Recursively process child folders")
	fmt.Println(" This is used by the MAKE task")

	fmt.Println("")

}

func main() {

	currentDirectory, _ := os.Getwd()
	params.sourcePath = currentDirectory

	loadArguments(os.Args)

	matched := false

	if params.help {
		printHelp()
	} else {

		if params.task == "INIT" {
			if params.configPath == "" {
				panic("Missing 'config' parameter")
			}

			// Refuse to overwrite an existing configuration file
			_, err := os.Stat(params.configPath)
			if err == nil {
				panic("MartiLQ configuration file '" + params.configPath + "' already exists")
			}

			c := martilq.NewConfiguration()
			if !c.SaveConfig(params.configPath) {
				panic("Configuration not saved to: " + params.configPath)
			}
			fmt.Println("Created MARTILQ INI definition: " + params.configPath)
			matched = true
		}

		if params.task == "MAKE" {

			if params.sourcePath == "" {
				panic("Missing 'source' parameter")
			}
			// The definition path is supplied with -m / --martilq
			if params.definitionPath == "" {
				panic("Missing 'martilq' parameter")
			}

			_, err := os.Stat(params.definitionPath)
			if err == nil && !params.update {
				panic("MartiLQ document '" + params.definitionPath + "' already exists and update not specified")
			}

			m := martilq.Make(params.configPath, params.sourcePath, params.filter, params.recursive, params.urlPrefix, params.definitionPath)
			if params.title != "" {
				m.Title = params.title
			}
			if params.description != "" {
				m.Description = params.description
			}
			m.Save(params.definitionPath)

			fmt.Println("Created MARTILQ definition: " + params.definitionPath)
			matched = true
		}

		if params.task == "GET" {
			fmt.Println("GET task not implemented")
			matched = true
		}

		if params.task == "RECON" {

			_ = martilq.ReconcileFilePath(params.configPath, params.sourcePath, params.recursive, params.definitionPath, params.outputPath)

			matched = true
		}

		if !matched {
			printHelp()
		}

	}
}

4
tools.md
@ -1,8 +1,8 @@
# Tools

A number of tools are povided that can be incorporated into your
A number of tools are provided that can be incorporated into your
projects that want to use the metadata transfer reconciliation format
(martiLQ document).
(**martiLQ** document).

The Python or PowerShell (Windows or Linux) scripts can be
inserted into your processing pipeline either to pack or