Reviewing and correcting documentation

draft_specifications
meerkat 2021-12-12 23:14:23 +11:00
parent d5d4df7874
commit 9fdd18faae
16 changed files with 437 additions and 135 deletions

@ -1,13 +1,13 @@
# Documentation # Documentation
**martiLQ** stands for metadata reconcilation for transfer information, load quality **martiLQ** stands for metadata reconciliation for transfer information, load quality
Before starting with **martiLQ** it is advisable to understand if it is right for Before starting with **martiLQ** it is advisable to understand if it is right for
your organisation's needs. Information is available in a number of short your organization's needs. Information is available in a number of short
documents. documents.
There is no quickstart document to get you started as each use case and There is no quick start document to get you started as each use case and
organisation is different. There are sample implementations which you organization is different. There are sample implementations which you
can adjust if they resonate with your circumstances, can adjust if they resonate with your circumstances,
see [sample implementations](samples/) see [sample implementations](samples/)

@ -1,7 +1,7 @@
# Acknowledgment # Acknowledgment
Once the **martiLQ** document is received by a consumer then communicating the receipt, processing, Once the **martiLQ** document is received by a consumer then communicating the receipt, processing,
success or failure completes the feedback loop and builds an extra layer of assurance for the organisation. success or failure completes the feedback loop and builds an extra layer of assurance for the organization.
The acknowledgement workflow provides the necessary feedback. If an acknowledgement is required as part of the The acknowledgement workflow provides the necessary feedback. If an acknowledgement is required as part of the
consumption design then the following approach is recommended. consumption design then the following approach is recommended.
@ -9,41 +9,55 @@ consumption design then the following is approach is recommended.
1. The publisher provides callback details. For extra security the callback details should be signed. 1. The publisher provides callback details. For extra security the callback details should be signed.
2. The consumer will acknowledge the receipt of the **martiLQ** document by sending back the same 2. The consumer will acknowledge the receipt of the **martiLQ** document by sending back the same
document to the publisher with some values changed. document to the publisher with some values changed.
3. Change the root consumer and state (not resource) from ``active`` to ``receipt``. 3. Change the root consumer and state (not resource) from ``active`` to ``receipt`` and modify the ``stateModified`` timestamp.
4. Change the ``consumer`` data value to only be your identifier and not others, so that the publisher 4. Change the ``consumer`` data value to only be your identifier and not others, so that the publisher
can identify the consumer and associate it with success or failure. This change to consumer value can identify the consumer and associate it with success or failure. This change to consumer value
applies to all subsequent acknowledgement messages. applies to all subsequent acknowledgement messages.
5. Send the changed **martiLQ** document back using the callback details 5. Send the changed **martiLQ** document back using the callback details
6. On fetching each resource the resource state is changed from ``active`` to ``received``. If any resource 6. On fetching each resource the resource state is changed from ``active`` to ``received`` and modify the ``stateModified`` timestamp.
cannot be retrieved the state is changed from ``active`` to ``missing``. If any resource cannot be retrieved the state is changed from ``active`` to ``missing`` and the ``stateModified`` timestamp is updated.
7. The consumer can elect to send back the **martiLQ** document to the publisher on each fetch or at the completion 7. The consumer can elect to send back the **martiLQ** document to the publisher on each fetch or at the completion
of all fetches. The recommendation is to send at the end of all fetches because if there are issues then of all fetches. The recommendation is to send at the end of all fetches because if there are issues then
having all the failures for analysis should assist in determining the extent of the failure. having all the failures for analysis should assist in determining the extent of the failure.
8. Once all resources are fetched (or failed), the root state is changed from ``receipt`` to ``received`` if no 8. Once all resources are fetched (or failed), the root state is changed from ``receipt`` to ``received`` and update the ``stateModified``
errors occurred in retrieving the resources. If a single or many errors occurred, then the root state is timestamp if no errors occurred in retrieving the resources. If a single or many errors occurred, then the root state is
changed from ``receipt`` to ``missing``. The updated document is sent back to the publisher using changed from ``receipt`` to ``missing`` and the ``stateModified`` timestamp is updated. The updated document is sent back
the callback details. to the publisher using the callback details.
9. The next stage is to validate and process the resources defined in the **martiLQ** document. This follows 9. The next stage is to validate and process the resources defined in the **martiLQ** document. This follows
a similar process to fetching the resources. a similar process to fetching the resources.
10. On processing each resource the resource state is changed from ``received`` to ``processed``. If any resource 10. On processing each resource the resource state is changed from ``received`` to ``processed`` and modify the ``stateModified`` timestamp.
cannot be processed the state is changed from ``received`` to ``error``. Once again this can be acknowledged If any resource cannot be processed the state is changed from ``received`` to ``error`` and update the ``stateModified`` timestamp. Once
back to the publisher. again this can be acknowledged back to the publisher.
11. Once all resources are processed (or failed), the root state is changed from ``received`` to ``processed`` if no 11. Once all resources are processed (or failed), the root state is changed from ``received`` to ``processed`` and the ``stateModified`` timestamp
errors occurred in processing the resources. If a single or many errors occurred, then the root state is updated if no errors occurred in processing the resources. If a single or many errors occurred, then the root state is
changed from ``received`` to ``error``. The updated document is sent back to the publisher using changed from ``received`` to ``error`` and the ``stateModified`` timestamp modified. The updated document is sent back to the publisher using
the callback details. the callback details.
This completes the acknowledgment workflow for the **martiLQ** document. The level of acknowledgement feedback This completes the acknowledgment workflow for the **martiLQ** document. The level of acknowledgement feedback
you wish to implement as a consumer is your decision. Any publisher providing callback details for acknowledgement can also you wish to implement as a consumer is your decision. Any publisher providing callback details for acknowledgement can also
choose their behaviour on actions and recording any acknowledgments received. choose their behavior on actions and recording any acknowledgments received.
In the above acknowledgement process, you **must not** change the identifiers in the **martiLQ** document and you **should not** In the above acknowledgement process, you **must not** change the identifiers in the **martiLQ** document and you **should not**
change other data except the ``consumer`` and ``state`` and ``stateModified``. change other data except the ``consumer`` and ``state`` and ``stateModified``.
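The fetch stage of the workflow above can be sketched in Python as follows. This is a minimal illustration, not the reference implementation: the ``consumer``, ``state`` and ``stateModified`` names come from the text, while the ``resources`` list, the ISO 8601 timestamp format and the function names are assumptions made for the example.

```python
import copy
from datetime import datetime, timezone


def _now() -> str:
    """Return the current UTC time as an ISO 8601 string (assumed format)."""
    return datetime.now(timezone.utc).isoformat()


def acknowledge_receipt(document: dict, consumer_id: str) -> dict:
    """Steps 2 to 4: copy the document, keep only our consumer id, set root state."""
    ack = copy.deepcopy(document)  # leave the published document untouched
    ack["consumer"] = consumer_id
    ack["state"] = "receipt"
    ack["stateModified"] = _now()
    return ack


def record_fetch(ack: dict, index: int, fetched: bool) -> None:
    """Step 6: mark one resource as 'received' or 'missing'."""
    resource = ack["resources"][index]  # 'resources' is an assumed key
    resource["state"] = "received" if fetched else "missing"
    resource["stateModified"] = _now()


def complete_fetch(ack: dict) -> None:
    """Step 8: root becomes 'received' only if every resource was received."""
    all_received = all(r["state"] == "received" for r in ack["resources"])
    ack["state"] = "received" if all_received else "missing"
    ack["stateModified"] = _now()
```

Only ``consumer``, ``state`` and ``stateModified`` are written, so identifiers and other data pass through unchanged. The processing stage (``processed`` or ``error``) would follow the same shape.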
If you are the publisher and expect acknowledgment then there is an extra scenario you need to cater for. The scenario is If you are the publisher and expect acknowledgment then there is an extra scenario you need to cater for. The scenario is
that you do not recieve any acknowledgement back from the expected consumer(s) within the agreed timeframe. In this situation that you do not receive any acknowledgement back from the expected consumer(s) within the agreed time frame. In this situation
the publisher will need to know each consumer and their service level agreements. the publisher will need to know each consumer and their service level agreements.
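One way a publisher might detect the missing-acknowledgement scenario is sketched below. The function name, the per-consumer time frame mapping and the set of acknowledged consumers are all hypothetical; how a publisher actually records acknowledgements is left to each implementation.

```python
from datetime import datetime, timedelta, timezone


def overdue_consumers(sent_at, sla_by_consumer, acknowledged, now=None):
    """Return consumers whose agreed time frame passed without any acknowledgement.

    sent_at          -- when the document was published (timezone-aware datetime)
    sla_by_consumer  -- mapping of consumer id to the agreed timedelta
    acknowledged     -- set of consumer ids that have sent any acknowledgement
    """
    now = now or datetime.now(timezone.utc)
    return sorted(
        consumer
        for consumer, sla in sla_by_consumer.items()
        if consumer not in acknowledged and now > sent_at + sla
    )
```

The publisher could run a check like this on a schedule and escalate for each consumer returned.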
## Callback
The callback method can take any form. The normal expectation is that the same method used to communicate the **martiLQ**
document is used for the callback, for example:
* If the **martiLQ** document is originally sent by Kafka queue then the callback should use a separate Kafka topic
* If the **martiLQ** document is originally sent by REST API then the callback is a publisher REST API end point
* If the **martiLQ** document is originally sent by email then the callback uses a reply address
While the above is the expected convention, you can mix and match. For example:
* If the **martiLQ** document is originally sent by REST API then the callback can be an email address. This
situation could be acceptable if the publisher does not have the ability to accept REST API callbacks.
## Compressed file handling ## Compressed file handling
When the **martiLQ** document is defining a parent compressed file, e.g. ZIP or 7Z, then the resources are expected When the **martiLQ** document is defining a parent compressed file, e.g. ZIP or 7Z, then the resources are expected
@ -52,7 +66,3 @@ state of the resource is still changed to reflect the processing.
If the file cannot be extracted either because it has not been included or there is a decompression error, then the If the file cannot be extracted either because it has not been included or there is a decompression error, then the
same acknowledgement process of using the state is used. same acknowledgement process of using the state is used.
## Error situations

@ -1,11 +1,10 @@
Comparison of martiLQ document definition # Comparison of martiLQ document definition
=========================================
The use of metadata definitions is not unique and examples The use of metadata definitions is not unique and examples
exist in many different situations. Some are standard and open exist in many different situations. Some are standard and open
while others are closed. while others are closed.
Some open standards are EXIF data for pictures, SQL DDL defintions Some open standards are EXIF data for pictures, SQL DDL definitions
for databases, the XMP definition and web header responses before the for databases, the XMP definition and web header responses before the
web content. web content.
@ -23,18 +22,16 @@ a CKAN source into a **martiLQ** document definition and then process the data
through the pipeline as you would for any other data file that had a through the pipeline as you would for any other data file that had a
**martiLQ** document definition. **martiLQ** document definition.
Benefit of CKAN and martiLQ ## Benefit of CKAN and martiLQ
---------------------------
The CKAN is excellent at defining the data source details but it lacks information The CKAN is excellent at defining the data source details but it lacks information
for load quality. If you have CKAN deployed in your organisation and wish for load quality. If you have CKAN deployed in your organization and wish
to exchange or process the data referenced in CKAN, then there are synergies between to exchange or process the data referenced in CKAN, then there are synergies between
CKAN and marti. CKAN and marti.
Samples exist on CKAN integration. Samples exist on CKAN integration.
Magda and martiLQ ## Magda and martiLQ
-----------------
Another source of data is [Magda](https://magda.io/) which has API metadata Another source of data is [Magda](https://magda.io/) which has API metadata
definitions. Magda is more about data federation and as such provides definitions. Magda is more about data federation and as such provides

@ -1,7 +1,7 @@
Custom Definition Custom Definition
================= =================
The custome definition section allows the inclusion of extensions The custom definition section allows the inclusion of extensions
to the standard. To demonstrate the inclusion, there are three to the standard. To demonstrate the inclusion, there are three
sample extensions. These are: sample extensions. These are:

@ -1,5 +1,5 @@
Magda definitions # Magda definitions
=================
https://magda.io/ https://magda.io/

@ -39,7 +39,7 @@ modified|Modified date and time of the **martiLQ** document|Now
tags|List of tags or keywords| tags|List of tags or keywords|
publisher|Publisher name| publisher|Publisher name|
contactPoint|Contact point of a person or team| contactPoint|Contact point of a person or team|
accessLevel|Acces level| accessLevel|Access level|
rights|Rights| rights|Rights|
* Batch * Batch
@ -54,7 +54,7 @@ rights|Rights|
### Information extension ### Information extension
The information supplied can be extended by party agreement and there The information supplied can be extended by party agreement and there
are place holders in the defintion. are place holders in the definition.
## Resource ## Resource
@ -62,7 +62,7 @@ The resource section is a list of documents or files that are to be grouped
together are listed under the same **martiLQ** definition. together are listed under the same **martiLQ** definition.
At least one document or file must be included. If the same resource is repeated At least one document or file must be included. If the same resource is repeated
it will commonly be for definiting multiple formats, with each file having a it will commonly be defining multiple formats, with each file having a
different extension. Commonly the definition includes at least the following different extension. Commonly the definition includes at least the following
items: items:
@ -84,7 +84,7 @@ for more details
Name|Description|Default or values Name|Description|Default or values
---|---|-- ---|---|--
hash|Hash of document - The hash of the document, which can be blank especially for large documents hash|Hash of document - The hash of the document, which can be blank especially for large documents
algo|Hash algorithm - Algoroithm used to generate the hash value or sign it algo|Hash algorithm - Algorithm used to generate the hash value or sign it
description|Description - A more detailed description description|Description - A more detailed description
version|Version - A document version version|Version - A document version
encoding|Encoding encoding|Encoding
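As an illustration of the ``hash`` and ``algo`` attributes, the sketch below computes both values for one file with Python's standard ``hashlib``. The function name, the ``skip_above`` threshold and the use of ``hashlib`` algorithm names for ``algo`` are assumptions for the example; the values the reference implementation writes may differ.

```python
import hashlib
import os
from typing import Optional


def hash_entry(path: str, algo: str = "sha256", skip_above: Optional[int] = None) -> dict:
    """Build the ``hash`` and ``algo`` values for one resource."""
    if skip_above is not None and os.path.getsize(path) > skip_above:
        # A blank hash is allowed, especially for large documents.
        return {"hash": "", "algo": algo}
    digest = hashlib.new(algo)
    with open(path, "rb") as handle:
        # Read in 1 MiB chunks so large files are not loaded into memory at once.
        for chunk in iter(lambda: handle.read(1024 * 1024), b""):
            digest.update(chunk)
    return {"hash": digest.hexdigest(), "algo": algo}
```

A consumer can recompute the value with the same ``algo`` and compare it to the published ``hash`` to detect tampering or corruption.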

@ -19,7 +19,7 @@ load quality metrics.
* Number of records in the document - This is the number of data primary records not the * Number of records in the document - This is the number of data primary records not the
count of end of lines and is agreed between parties. XML record counts could be based count of end of lines and is agreed between parties. XML record counts could be based
on the number of primary segments under root. JSON records can be counted in a similar way. on the number of primary segments under root. JSON records can be counted in a similar way.
The headers or trailling records are not counted The headers or trailing records are not counted
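The record count rule can be sketched as below. These helpers are illustrative only: the function names are hypothetical, and the exact counting convention (how many header or trailer rows, what counts as a primary segment, how a JSON object root is counted) is whatever the parties agree.

```python
import csv
import io
import json
import xml.etree.ElementTree as ET


def count_csv_records(text: str, header_rows: int = 1, trailer_rows: int = 0) -> int:
    """Count data rows, excluding blank lines, headers and trailers."""
    rows = [row for row in csv.reader(io.StringIO(text)) if row]
    return max(len(rows) - header_rows - trailer_rows, 0)


def count_xml_records(text: str) -> int:
    """Count the primary segments, taken here as direct children of the root."""
    return len(list(ET.fromstring(text)))


def count_json_records(text: str) -> int:
    """Count entries directly under the root (array items or object members)."""
    return len(json.loads(text))
```

Because the count is of primary data records rather than end of lines, a quoted CSV field containing a line break is still one record.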
## Addresses deficiencies ## Addresses deficiencies

@ -1,7 +1,7 @@
# References # References
The following are references to documents that inspired the creation of **martiLQ** The following are references to documents that inspired the creation of **martiLQ**
document and associatd framework. document and associated framework.
https://dex.dss.gov.au/sites/default/files/documents/2021-06/data-exchange-protocols-june-2021.pdf https://dex.dss.gov.au/sites/default/files/documents/2021-06/data-exchange-protocols-june-2021.pdf

@ -12,7 +12,7 @@ for various programming languages and situations. As many programming languages
generate portable programs that can execute on multiple operating systems, the generate portable programs that can execute on multiple operating systems, the
likelihood is that a tool exists for you. likelihood is that a tool exists for you.
The source for tools is provided in the Github repository and some have precompiled The source for tools is provided in the Github repository and some have pre-compiled
images. images.
See the project source directory for more details. See the project source directory for more details.

@ -1,13 +1,11 @@
Who is likely to use martiLQ # Who is likely to use martiLQ
============================
You are likely to find the **martiLQ** framework relevant if you: You are likely to find the **martiLQ** framework relevant if you:
1. Have many document exchanges, such as End of Day batches 1. Have many document exchanges, such as End of Day batches
2. Need to verify or reconcile the documents 2. Need to verify or reconcile the documents
Data exchanges ## Data exchanges
--------------
If you are creating or receiving many documents or files on a regular basis If you are creating or receiving many documents or files on a regular basis
then you probably have some framework defined. The framework may be as simple as: then you probably have some framework defined. The framework may be as simple as:
@ -24,8 +22,7 @@ Simple framework such as the above have limitations, such as:
* Lower automation prospects and alignment to DataSecOps * Lower automation prospects and alignment to DataSecOps
* Poor fit to web applications (they tend to be designed for FTP and LAN) * Poor fit to web applications (they tend to be designed for FTP and LAN)
Framework Sidecar files ## Framework Sidecar files
-----------------------
The **martiLQ** framework addresses the issues and limitations by using sidecar The **martiLQ** framework addresses the issues and limitations by using sidecar
or shadow files. The [concept of sidecar files](https://en.wikipedia.org/wiki/Sidecar_file) is or shadow files. The [concept of sidecar files](https://en.wikipedia.org/wiki/Sidecar_file) is
@ -35,7 +32,7 @@ Sidecar files can also be implemented as ``forks`` and built into the operating
in Mac OS X HFS. The Microsoft NTFS supports Alternate Data Streams to achieve a similar outcome. in Mac OS X HFS. The Microsoft NTFS supports Alternate Data Streams to achieve a similar outcome.
Unfortunately this information is not transferrable to other systems. Unfortunately this information is not transferrable to other systems.
The proposition is to define a format for the sidecare file and provide common library tools that The proposition is to define a format for the sidecar file and provide common library tools that
can be used on multiple platforms when exchanging documents / files. Multiple documents can be can be used on multiple platforms when exchanging documents / files. Multiple documents can be
defined in a singel **martiLQ** definition which adds to efficiency and productivity if used defined in a single **martiLQ** definition which adds to efficiency and productivity if used
for End of Day or similar batches - or even single file transfers. for End of Day or similar batches - or even single file transfers.

@ -6,8 +6,8 @@
and intended to be consumed by another system component with self-describing information with and intended to be consumed by another system component with self-describing information with
load assurance metrics. load assurance metrics.
The consuming system component can be at the same location, a dfifferent geographical location, The consuming system component can be at the same location, a different geographical location,
the same organisation or another organisation. the same organization or another organization.
The pattern does not define the format that the data file or document must take or how the data is transferred The pattern does not define the format that the data file or document must take or how the data is transferred
or accessed. You choose the data format and transfer method. Once you have made the choice, you can describe or accessed. You choose the data format and transfer method. Once you have made the choice, you can describe
@ -23,7 +23,7 @@ to demonstrate generating the **martiLQ** document.
## Problem statement ## Problem statement
Even though event streaming is a stragetic goal for many organisations, there exists legcay processes and there Even though event streaming is a strategic goal for many organizations, there exist legacy processes and there
will continue to be a need to transfer data files and other documents from one system to another. will continue to be a need to transfer data files and other documents from one system to another.
When a handover of a data file or document occurs, the best practice is to include metrics with the transfer When a handover of a data file or document occurs, the best practice is to include metrics with the transfer
@ -31,14 +31,14 @@ to assure the recipient of provenance and quality of the data file or document.
with the data file or document. with the data file or document.
A document includes unstructured data, letters, pictures, binary objects while data files could be thought of A document includes unstructured data, letters, pictures, binary objects while data files could be thought of
as strutured data that is describes multiple records. as structured data that describes multiple records.
### Assurance Problem ### Assurance Problem
**How does the recipient know they have received all related files, the provenance, it is immutable and **How does the recipient know they have received all related files, the provenance, it is immutable and
assurance on quality?** assurance on quality?**
Many organisations have used the file name as the carrier of this information but this has limits. Many organizations have used the file name as the carrier of this information but this has limits.
## Efficiency Problem ## Efficiency Problem
@ -54,7 +54,7 @@ Therefore the objective is to produce a documentation standard that:
1. provides load assurance when transferring data files and documents 1. provides load assurance when transferring data files and documents
2. can be tooled and therefore achieve some level of automation 2. can be tooled and therefore achieve some level of automation
3. is extensible to give the publisher and consumer control as to the level of assurance 3. is extensible to give the publisher and consumer control as to the level of assurance
required to match the risk appetite of the organisation required to match the risk appetite of the organization
## Context ## Context
@ -84,34 +84,64 @@ as they are considered the minimal for best practice
* Format, encoding, compression * Format, encoding, compression
* Data record count * Data record count
There is an acknowledgment processs that is recommended for confirmation on processing. See There is an acknowledgment process that is recommended for confirmation on processing. See
[acknowldegment](docs/source/acknowledgement.md) for approach details. [acknowledgement](docs/source/acknowledgement.md) for approach details.
## Forces ## Forces
The qualities that this pattern is addressing... The qualities that this pattern is addressing are:
1. Frees transfer from file naming conventions that include magic strings that store metadata
2. An event message based paradigm which is independent of the size and number of files
3. Publishes basic metadata on the files and their source
4. Secures the file transfer from tampering or corruption
5. Allows the inclusion of quality metrics such as provenance, elements and record counts
6. Allows the consumer to select the files to process avoiding unnecessary transfers
7. Provides a simple acknowledgment process
8. Is extendable
The file transfer pattern is the original method for separate processes to exchange data. The file being stored on magnetic tape and either The file transfer pattern is the original method for separate processes to exchange data. The file being stored on magnetic tape and either
loaded back onto the same compute resource (think mainframe) or physicaly couriered to another lcoation or tape drive. The loaded back onto the same compute resource (think mainframe) or physically couriered to another location or tape drive. The
reference book [Enterprise Integration Patterns](https://www.enterpriseintegrationpatterns.com/patterns/messaging/FileTransferIntegration.html) reference book [Enterprise Integration Patterns](https://www.enterpriseintegrationpatterns.com/patterns/messaging/FileTransferIntegration.html)
by Hohpe and Woolf recognises this by inculsion of the pattern written by Martin Fowler. by Hohpe and Woolf recognizes this by inclusion of the pattern written by Martin Fowler.
This pattern addresess the issues and concerns that relate to file transfer. Many of these are related the the common In the explanation written by Martin Fowler, he makes observations about the "File Transfer" including:
"Part of what makes _File Transfer_ simple is that no extra tools or integration packages are needed, but that also means that developers
have to do a lot of the work themselves. The applications must agree on file-naming conventions and the directories they appear. ...
, then some application must take responsibility for transferring the file from one disk to another"
The pattern being described here addresses the issues and concerns that relate to file transfer. Many of these are related to the common
non functional requirements that architects cover in solution designs. non functional requirements that architects cover in solution designs.
### Security, robustness, reliability, fault-tolerance ### Security, robustness, reliability, fault-tolerance
The pattern defines how security and assurance is applied to the data files and documents. The pattern does The pattern defines how security and assurance is applied to the data files and documents. The pattern does
not define how to setup a reliable infrastructure, but it can be used to detect failures not define how to setup a reliable infrastructure, but it can be used to detect failures
in the infrastructire. The fault-tolerance allowance is up to each implementation. in the infrastructure. The fault-tolerance allowance is up to each implementation.
Fault-tolerance and the actionable task can be dialled from 0% tolerance to 100% tolerance on a Fault-tolerance and the actionable task can be dialled from 0% tolerance to 100% tolerance on a
case by case basis. case by case basis.
### Manageability ### Manageability
The pattern takes into consideration how the file transfer is managed. It can provide a standard
that makes file transfers easier to manage regardless of the underlying transport mechanism. As the
same **martiLQ** document can be consumed by multiple recipients, it is easy to distribute, with
access controls ensuring only authorized recipients can access the files that are relevant to them.
This reduces the need to run multiple jobs distributing the files.
Management is also improved if the acknowledgement capability is implemented so that the publisher
knows which recipient has processed the file. If the recipient no longer wants the file and stops
processing, the publisher will gradually build a timeline and metrics to recognize that the file
is no longer consumed. The publisher can then cease to produce unused files.
### Efficiency, performance, throughput, bandwidth requirements, space utilization ### Efficiency, performance, throughput, bandwidth requirements, space utilization
If the process is using event based messaging, files that are not required at the destination
are never transferred. This saves bandwidth and storage at the destination.
### Scalability (incremental growth on-demand) ### Scalability (incremental growth on-demand)
The pattern scalability is not bound to the size of the data files themselves. The pattern can The pattern scalability is not bound to the size of the data files themselves. The pattern can
@ -120,43 +150,64 @@ may be factor in the decision of breaking down to smaller volumes.
### Extensibility, evolvability, maintainability ### Extensibility, evolvability, maintainability
The **martiLQ** document can be customised and can evolve as the market conidtions change. Versioning The **martiLQ** document can be customized and can evolve as the market conditions change. Versioning
is built into the definition and consumers can select which attributes are mandatory for is built into the definition and consumers can select which attributes are mandatory for
processing. processing.
### Modularity, independence, re-usability, openness, composability (plug-and-play), portability ### Modularity, independence, re-usability, openness, composability (plug-and-play), portability
The **martiLQ** document is an open definition that can be used in many file transfer scenarios. You can compose
new functionality on top of the open code.
The **martiLQ** document is portable as is the reference implementation. You can run the Python, PowerShell, Go code
on both Windows and Linux platforms and on different architectures.
### Completeness and correctness ### Completeness and correctness
The **martiLQ** document contains metadata to ensure that all files in a job file transfer are treated
as a package or integral unit. If files are missing then this is recognized early in the process and
the recipient consumer can decide on whether to continue processing or halt.
Additional scope also exists in the **martiLQ** document to add more load quality assurance metrics, which can be
automatically processed to ensure correctness.
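A minimal sketch of the early completeness check might look like the following. The ``resources`` list and the ``documentName`` key are hypothetical names chosen for the example, and the decision to continue or halt is left to the consumer.

```python
import os


def missing_resources(document: dict, directory: str) -> list:
    """Return the names of listed resources that are not present in the directory."""
    return [
        resource["documentName"]  # assumed key holding the file name
        for resource in document["resources"]
        if not os.path.isfile(os.path.join(directory, resource["documentName"]))
    ]
```

An empty result means the package is complete. Otherwise the consumer can halt, or continue and report the gaps through the acknowledgement process.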
### Ease-of-construction ### Ease-of-construction
The **martiLQ** document is a JSON formatted document, making it easy to construct using modern tools.
**Note**: An XML format document is in the backlog for possible implementation.
Using the reference implementation, the organization can implement the pattern into their current process without
requiring extensive builds. The reference implementation has code in various programming languages and can run on
Windows and Linux platforms.
All code is visible and auditable.
**Note**: The reference implementation is not the most efficient code in all situations and there is much room for
improvement. The objective of the reference implementation was to demonstrate the ease of use by scanning
a file system directory or converting from another format such as CKAN.
### Ease-of-use ### Ease-of-use
## Solution If you are comfortable with using the reference implementation, you can be generating **martiLQ** documents in a short time.
Download the git repository code, review the samples and adjust to scan your directory to generate the **martiLQ** document.
As a simplistic approach you can execute the code in your pipeline after you have created existing files.
A description, using text and/or graphics, of how to achieve the intended goals and objectives. The description should identify both the solution's static structure and its dynamic behavior - the people and computing actors, and their collaborations. The description may include guidelines for implementing the solution. Variants or specializations of the solution may also be described.
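To give a feel for the directory scan approach, here is a rough sketch that builds a JSON-ready dictionary from the files in one folder. It is not the reference implementation: ``publisher``, ``modified`` and ``state`` appear in the documentation, while ``resources``, ``documentName``, ``size`` and the function name are assumptions for illustration.

```python
import json
import os
from datetime import datetime, timezone


def scan_directory(directory: str, publisher: str) -> dict:
    """Describe every regular file directly inside the directory as a resource."""
    resources = []
    for name in sorted(os.listdir(directory)):
        path = os.path.join(directory, name)
        if os.path.isfile(path):  # sub-directories are skipped in this sketch
            resources.append(
                {"documentName": name, "size": os.path.getsize(path), "state": "active"}
            )
    return {
        "publisher": publisher,
        "modified": datetime.now(timezone.utc).isoformat(),
        "state": "active",
        "resources": resources,
    }


if __name__ == "__main__":
    # Print a document for the current directory; the publisher name is made up.
    print(json.dumps(scan_directory(".", "example-publisher"), indent=2))
```

Run as the last step of a pipeline, after the data files have been written, this kind of scan produces the sidecar document without changing the existing job.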
## Resulting Context ## Resulting Context
After applying the pattern consistently on file transfers within the organization, the expectation is that you will spend less
time discussing and building the mechanics of file transfer including the polling, monitoring and load assurance. A large portion
which will have been for you as part of the **martiLQ** document and its implementation.
There will also be less documentation to review and discuss as the **martiLQ** document will provide standards that developers can
follow such as the encoding and format. This applies to both structured and unstructured data.
The post-conditions after the pattern has been applied. Implementing the solution normally requires trade-offs among competing forces. For structured data, you will still need to decide how many files to produce and which data columns and records to include in each file.
This element describes which forces have been resolved and how, and which remain unresolved. It may also indicate other patterns that may be applicable in the new context. (A pattern may be one step in accomplishing some larger goal.) Any such other patterns will be described in detail under Related Patterns.
For unstructured data, such as bundling like documents together, the process is much simpler if you create
a folder containing the files and then execute the routine to compress and package them all together.
## Examples ## Examples
Please refer to the [documentation](docs/source/README.md) and [samples](docs/source/samples/README.md) Please refer to the [documentation](docs/source/README.md) and [samples](docs/source/samples/README.md)
## Rationale
An explanation/justification of the pattern as a whole, or of individual components within it, indicating how the pattern actually works, and why - how it resolves the forces to achieve the desired goals and objectives, and why this is "good". The Solution element of a pattern describes the external structure and behavior of the solution: the Rationale provides insight into its internal workings.
## Related Patterns
The relationships between this pattern and others. These may be predecessor patterns, whose resulting contexts correspond to the initial context of this one; or successor patterns, whose initial contexts correspond to the resulting context of this one; or alternative patterns, which describe a different solution to the same problem, but under different forces; or co-dependent patterns, which may/must be applied along with this pattern.
## Known Uses
Known applications of the pattern within existing systems, verifying that the pattern does indeed describe a proven solution to a recurring problem. Known Uses can also serve as Examples.


@ -1,15 +1,13 @@
# Sample execution # Sample execution
A number of samples are provided to demonstrate what the **martiLQ** documents A number of samples are provided to demonstrate what the **martiLQ** documents
look like and how simple the exceution can be. look like and how simple the execution can be.
For the BSB (Bank State Branch) samples below, you will first need to fetch the files for
local processing. See TBA
## Python ## Python
If you have the required Python software and packages installed, and have Internet If you have the required Python software and packages installed, and have Internet
then the following commands should generate output for you. access then the following commands will generate output for you. If you use
a proxy, then there can be issues.
Open a terminal with the current directory set to the project root (here) Open a terminal with the current directory set to the project root (here)
@ -20,12 +18,12 @@ Open a terminal with the current directory set to the project root (here)
.\source\python\client\martiLQ.py -t GEN -s "./docs/source/samples/python/test/http/" -o "./test/python/results/test_proc_bsb.json" -c ./docs/source/samples/json/sample_bsb.ini -u http://apnedata.merebox.com.s3.ap-southeast-2.amazonaws.com/au/bsb/ .\source\python\client\martiLQ.py -t GEN -s "./docs/source/samples/python/test/http/" -o "./test/python/results/test_proc_bsb.json" -c ./docs/source/samples/json/sample_bsb.ini -u http://apnedata.merebox.com.s3.ap-southeast-2.amazonaws.com/au/bsb/
``` ```
For details using Python samples see There are also a number of Python test scripts you can execute
## Powershell ## Powershell
If you have the required PowerShell software and packages installed, and have Internet If you have the required PowerShell software and packages installed, and have Internet
then the following commands should generate output for you. access then the following commands will generate output for you.
Open a terminal with the current directory set to the project root (here) Open a terminal with the current directory set to the project root (here)
@ -33,18 +31,19 @@ The PowerShell command
```ps1 ```ps1
.\test\powershell\martiLQ_base_test.ps1
# This sample will retrieve a number of CKAN files from # This sample will retrieve a number of CKAN files from
# Australian government sites to demonstrate conversion # Australian government and Singapore sites to demonstrate conversion
.\test\powershell\test_MartiLQCkan.ps1 .\test\powershell\martiLQ_ckan_test.ps1
``` ```
For details using PowerShell samples see
## Go ## Go
If you have the required GOLANG software and packages installed, and have Internet If you have the required GOLANG software and packages installed, and have Internet
then the following commands should generate output for you. access then the following commands will generate output for you. If you use
a proxy, then there can be issues.
Open a terminal with the current directory set to the project root (here) Open a terminal with the current directory set to the project root (here)
@ -61,7 +60,7 @@ go run . -- -t GEN -m %MARTILQ_PROJECT_PATH%/test/golang/results/test_proc_bsb.j
cd %MARTILQ_PROJECT_PATH% cd %MARTILQ_PROJECT_PATH%
``` ```
A PowerShell script to execute A PowerShell script to execute the Go program
```ps1 ```ps1
$env:MARTILQ_PROJECT_PATH=Get-Location $env:MARTILQ_PROJECT_PATH=Get-Location
@ -81,39 +80,3 @@ go run . -- -t MAKE -m $mfile -c $cfile -s $spath --title "GEN005" --description
Set-Location -Path $env:MARTILQ_PROJECT_PATH -PassThru Set-Location -Path $env:MARTILQ_PROJECT_PATH -PassThru
``` ```
For details using Go samples see
go run . -- -t GEN -m ./test/test_main_doc_Sample01.json -s ./docs/source/martilq.md --title "GEN001" --description "Simple example with no config"
go run . -- -t GEN -m ./test/test_main_doc_Sample02.json -c ./docs/source/samples/json/GEN002.ini -s ./docs/source/martilq.md --title "GEN002" --description "Simple example"
go run . -- -t GEN -m ./test/test_main_doc_Sample03.json -c ./docs/source/samples/json/GEN002.ini -s ./docs/source/ --title "GEN003" --description "Directory example"
go run . -- -t GEN -m ./test/test_main_doc_Sample04.json -s ./docs/source/ --title "GEN004" --description "Directory example with filter" -R --filter "r.*\.md"
go run . -- -t GEN -m ./test/test_main_doc_Sample05.json -c ./docs/source/samples/json/GEN005.ini -s C.\docs\source\samples\python\test\http\ --title "GEN005" --description "Directory example for BSB with filter" -R --filter "BSBDirectory.*\.csv"
https://github.com/meerkat-manor/marti/blob/draft_specifications/docs/source/martiLQ.md
https://github.com/meerkat-manor/marti/blob/draft_specifications/docs/source/martiLQ.md
SET GO_PROJECT_PATH=
go run . -- -t GEN -m ./test/test_main_doc_Sample05.json -c ./docs/source/samples/json/GEN005.ini -s .\docs\source\samples\python\test\http\ --title "GEN005" --description "Directory example for BSB with filter" -R --filter "BSBDirectory.*\.csv"
.\source\python\client\martiLQ.py -t GET -s "./test/python/results/data" -o "./test/python/results/test_proc_bsb.json"


@ -152,7 +152,7 @@ func (c *configuration) SaveConfig(ConfigPath string) bool {
cfgini.Section("General").Key("logPath").SetValue (c.logPath) cfgini.Section("General").Key("logPath").SetValue (c.logPath)
cfgini.Section("General").Key("tempPath").SetValue (c.tempPath) cfgini.Section("General").Key("tempPath").SetValue (c.tempPath)
cfgini.Section("General").Key("dataPath").SetValue (c.dataPath) cfgini.Section("General").Key("dataPath").SetValue (c.dataPath)
cfgini.Section("General").Key("dateFormat").SetValue (c.datdateFormataPath) cfgini.Section("General").Key("dateFormat").SetValue (c.dateFormat)
cfgini.Section("General").Key("dateTimeFormat").SetValue (c.dateTimeFormat) cfgini.Section("General").Key("dateTimeFormat").SetValue (c.dateTimeFormat)
cfgini.Section("MartiLQ").Key("tags").SetValue(c.tags) cfgini.Section("MartiLQ").Key("tags").SetValue(c.tags)


@ -1,2 +1,2 @@
package martilq


@ -0,0 +1,284 @@
package main
import (
"fmt"
"os"
"strings"
"merebox.com/martilq"
"io/ioutil"
)
type Parameters struct {
help bool
task string
sourcePath string
recursive bool
filter string
update bool
urlPrefix string
configPath string
definitionPath string
outputPath string
title string
description string
describedBy string
landing string
}
var params Parameters
func loadArguments(args []string) {
maxArgs := len(args)
ix := 1
for ix < maxArgs {
matched := false
if args[ix] == "-h" || args[ix] == "--help" {
matched = true
params.help = true
break
}
if args[ix] == "-t" || args[ix] == "--task" {
matched = true
ix = ix + 1
if ix < maxArgs {
params.task = strings.ToUpper(args[ix])
} else {
panic("Missing parameter for TASK")
}
}
if args[ix] == "-c" || args[ix] == "--config" {
matched = true
ix = ix + 1
if ix < maxArgs {
params.configPath = args[ix]
} else {
panic("Missing parameter for CONFIG")
}
}
if args[ix] == "-s" || args[ix] == "--source" {
matched = true
ix = ix + 1
if ix < maxArgs {
params.sourcePath = args[ix]
} else {
panic("Missing parameter for SOURCE")
}
}
if args[ix] == "-m" || args[ix] == "--martilq" {
matched = true
ix = ix + 1
if ix < maxArgs {
params.definitionPath = args[ix]
} else {
panic("Missing parameter for MARTILQ")
}
}
if args[ix] == "-o" || args[ix] == "--output" {
matched = true
ix = ix + 1
if ix < maxArgs {
params.outputPath = args[ix]
} else {
panic("Missing parameter for OUTPUT")
}
}
if args[ix] == "-R" || args[ix] == "--recursive" {
matched = true
params.recursive = true
}
if args[ix] == "--update" {
matched = true
params.update = true
}
if args[ix] == "--title" {
matched = true
ix = ix + 1
if ix < maxArgs {
params.title = args[ix]
} else {
panic("Missing parameter for TITLE")
}
}
if args[ix] == "--filter" {
matched = true
ix = ix + 1
if ix < maxArgs {
params.filter = args[ix]
} else {
panic("Missing parameter for FILTER")
}
}
if args[ix] == "--description" {
matched = true
ix = ix + 1
if ix < maxArgs {
if args[ix][0] == '@' {
desc, err := ioutil.ReadFile(args[ix][1:])
if err != nil {
panic("Description file not found: " + args[ix][1:])
}
params.description = string(desc)
} else {
params.description = args[ix]
}
} else {
panic("Missing parameter for DESCRIPTION")
}
}
if !matched && args[ix] != "--" {
fmt.Println("Unrecognised command line argument: " + args[ix])
}
ix = ix + 1
}
}
func printHelp() {
fmt.Println("")
fmt.Println("\t martilqcli_client ")
fmt.Println("\t =======++======== ")
fmt.Println("")
fmt.Println("\tThis program is intended as a simple reference implementation")
fmt.Println("\tin Go of the MartiLQ framework. It does not provide all")
fmt.Println("\tthe possible functionality but enough to demonstrate the concept.")
fmt.Println("")
fmt.Println(" The command line arguments are:")
fmt.Println("")
fmt.Println(" -h or --help : Display this help")
fmt.Println(" -t or --task : Execute a predefined task which are")
fmt.Println(" INIT initialise a new configuration file")
fmt.Println(" MAKE make a MartiLQ definition file")
fmt.Println(" GET resources based on MartiLQ definition file")
fmt.Println(" RECON reconcile a MartiLQ definition file")
fmt.Println(" -c or --config : Configuration file used by all tasks")
fmt.Println(" This is the file written by the INIT task")
fmt.Println(" -s or --source : Source directory or file to build MartiLQ definition")
fmt.Println(" This is used by the MAKE and RECON task")
fmt.Println(" -m or --martilq : MartiLQ definition file")
fmt.Println(" This is used by the MAKE and RECON task")
fmt.Println(" The MAKE task makes the file while")
fmt.Println(" RECON task reads the file")
fmt.Println(" -o or --output : Output file")
fmt.Println(" This is used by the RECON task")
fmt.Println("")
fmt.Println(" --title : Title for the MartiLQ. Think of this as")
fmt.Println(" the job name")
fmt.Println(" This is used by the MAKE task")
fmt.Println(" --description : Description for the MartiLQ. This can be text")
fmt.Println(" or a pointer to a file when the @ prefix is used")
fmt.Println(" This is used by the MAKE task")
fmt.Println(" --update : Update existing definition, otherwise fail if it exists already")
fmt.Println(" This is used by the MAKE task")
fmt.Println(" --filter : File filter")
fmt.Println(" This is used by the MAKE task")
fmt.Println(" -R or --recursive : Recursively process child folders")
fmt.Println(" This is used by the MAKE task")
fmt.Println("")
}
func main() {
currentDirectory, _ := os.Getwd()
params.sourcePath = currentDirectory
loadArguments(os.Args)
matched := false
if params.help {
printHelp()
} else {
if params.task == "INIT" {
if params.configPath == "" {
panic("Missing 'config' parameter")
}
_, err := os.Stat(params.configPath)
if err == nil {
panic("MartiLQ configuration file '"+ params.configPath+"' already exists")
}
c := martilq.NewConfiguration()
if !c.SaveConfig(params.configPath) {
panic("Configuration not saved to: "+ params.configPath)
}
fmt.Println("Created MARTILQ INI definition: " + params.configPath)
matched = true
}
if params.task == "MAKE" {
if params.sourcePath == "" {
panic("Missing 'source' parameter")
}
if params.definitionPath == "" {
panic("Missing 'martilq' parameter")
}
_, err := os.Stat(params.definitionPath)
if err == nil && !params.update {
panic("MartiLQ document '"+ params.definitionPath+"' already exists and update not specified")
}
m := martilq.Make(params.configPath, params.sourcePath, params.filter, params.recursive, params.urlPrefix, params.definitionPath )
if params.title != "" {
m.Title = params.title
}
if params.description != "" {
m.Description = params.description
}
m.Save(params.definitionPath)
fmt.Println("Created MARTILQ definition: " + params.definitionPath)
matched = true
}
if params.task == "GET" {
fmt.Println("GET task not implemented")
matched = true
}
if params.task == "RECON" {
_ = martilq.ReconcileFilePath(params.configPath, params.sourcePath, params.recursive, params.definitionPath, params.outputPath )
matched = true
}
if !matched {
printHelp()
}
}
}


@ -1,8 +1,8 @@
# Tools # Tools
A number of tools are povided that can be incorporated into your A number of tools are provided that can be incorporated into your
projects that want to use the metadata transfer reconciliation format projects that want to use the metadata transfer reconciliation format
(martiLQ document). (**martiLQ** document).
The Python or PowerShell (Windows or Linux) scripts can be The Python or PowerShell (Windows or Linux) scripts can be
inserted into your processing pipeline either to pack or inserted into your processing pipeline either to pack or