Fixing and standardising Python code

draft_specifications
meerkat 2021-11-10 19:55:01 +11:00
parent 130f7ad498
commit 6aec64569d
20 changed files with 839 additions and 129 deletions

7
.gitignore vendored
View File

@ -1,3 +1,4 @@
logs/
target/ target/
pom.xml.tag pom.xml.tag
pom.xml.releaseBackup pom.xml.releaseBackup
@ -11,10 +12,13 @@ buildNumber.properties
.mvn/wrapper/maven-wrapper.jar .mvn/wrapper/maven-wrapper.jar
test/powershell/results/ test/powershell/results/
test/golang/results/
test/python/results/
test/java/results/
docs/source/samples/powershell/test/ docs/source/samples/powershell/test/
docs/source/samples/python/test docs/source/samples/python/test
docs/site/
docs/build docs/build
source/python/client/__pycache__ source/python/client/__pycache__
@ -23,4 +27,3 @@ source/golang/client/src/test
source/golang/client/src/martilq_client* source/golang/client/src/martilq_client*
source/golang/client/src/*.exe source/golang/client/src/*.exe
docs/site/

View File

@ -36,6 +36,9 @@ information
- [powershell](source/powershell/README.md) - [powershell](source/powershell/README.md)
- [docker](source/docker/README.md) - [docker](source/docker/README.md)
See [samples](samples.md) for sample / test cases you can execute. This can help you
understand what can be processed, generated and reconciled.
## Transfer information ## Transfer information
The information in the **martiLQ** document is summarised below. For more detailed The information in the **martiLQ** document is summarised below. For more detailed

View File

@ -8,7 +8,7 @@ dataPath =
[MartiLQ] [MartiLQ]
tags = tags =
publisher = publisher = Meerkat@merebox.com
contactPoint = contactPoint =
accessLevel = accessLevel =
rights = rights =
@ -24,8 +24,12 @@ title = {{documentName}}
state = expired state = expired
expires = m:0:1:0 expires = m:0:1:0
encoding = encoding =
version = version = 1.0
urlPrefix = http://apnedata.merebox.com.s3.ap-southeast-2.amazonaws.com/au/bsb/ urlPrefix =
compression = 7Z
encryption = X.509
describedBy = https://github.com/meerkat-manor/marti/tree/draft_specifications/docs/source/
landingPage = https://github.com/meerkat-manor/marti/
[Hash] [Hash]

View File

@ -22,7 +22,7 @@ theme = Documentation
author = Hive author = Hive
title = {{documentName}} title = {{documentName}}
state = expired state = expired
expires = 2:0:0 expires = t:2:0:0
encoding = UTF-8 encoding = UTF-8
version = 1.0 version = 1.0
urlPrefix = http://localhost/martilq/ urlPrefix = http://localhost/martilq/

View File

@ -0,0 +1,59 @@
[General]
logPath =
tempPath =
dataPath =
[MartiLQ]
tags = sample,bsb
publisher = Meerkat@merebox.com
contactPoint = Meerkat Manor
accessLevel =
rights =
license =
batch =
theme =
[Resources]
author = Meerkat
title = {{documentName}}
state = expired
expires = m:0:1:0
encoding =
version = 1.0
urlPrefix =
compression =
encryption =
describedBy = https://github.com/meerkat-manor/marti/tree/draft_specifications/docs/source/
landingPage =
[Hash]
hashAlgorithm = SHA256
signKey_File =
signKey_Password =
[Network]
proxy =
proxy_User =
proxy_Credential =
[Custom_Spatial]
enabled = false
country =
region =
town =
[Custom_Temporal]
enabled = false
businessDate = {{yesterday}}
runDate = {{today}}

View File

@ -0,0 +1,61 @@
[General]
dateFormat = 2006-01-02
dateTimeFormat = 2006-01-02T15:04:05-0700
logPath =
tempPath =
dataPath =
[MartiLQ]
tags = sample,docs
publisher = Meerkat@merebox.com
contactPoint = Meerkat Manor
accessLevel = Public
rights = Read
license = MIT
batch = 0.1
theme = Sample
[Resources]
author = Meerkat
title = {{documentName}}
state = expired
expires = m:0:6:0
encoding = UTF-8
version = 1.0
urlPrefix =
compression = 7Z
encryption = X.509
describedBy = https://github.com/meerkat-manor/marti/tree/draft_specifications/docs/source/
landingPage = https://github.com/meerkat-manor/marti/
[Hash]
hashAlgorithm = MD5
signKey_File =
signKey_Password =
[Network]
proxy =
username =
password =
[Custom_Spatial]
enabled = false
country =
region =
town =
[Custom_Temporal]
enabled = false
businessDate = {{yesterday}}
runDate = {{today}}
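The sample configuration above leaves several keys blank. The following is a minimal Python sketch of how such a file could be overlaid on built-in defaults, assuming blank values mean "keep the default" (which is what the `LoadConfig` loops later in this commit appear to do). The names `load_overrides`, `SAMPLE_INI` and `wanted` are hypothetical and are not part of the martiLQ client.

```python
from configparser import ConfigParser

# Trimmed copy of the sample configuration, for illustration only.
SAMPLE_INI = """
[General]
dateFormat = 2006-01-02
logPath =

[MartiLQ]
tags = sample,docs
accessLevel = Public

[Resources]
title = {{documentName}}
expires = m:0:6:0
encoding = UTF-8

[Hash]
hashAlgorithm = MD5
"""


def load_overrides(ini_text, defaults, wanted):
    """Return a copy of `defaults` updated with non-blank values from the INI text.

    `wanted` maps a section name to the list of keys to read from it.
    Missing sections, missing keys and blank values leave the default untouched.
    """
    parser = ConfigParser()
    parser.read_string(ini_text)
    merged = dict(defaults)
    for section, keys in wanted.items():
        for key in keys:
            # fallback covers both a missing section and a missing key
            value = parser.get(section, key, fallback="")
            if value != "":
                merged[key] = value
    return merged


if __name__ == "__main__":
    defaults = {"logPath": "./logs/", "hashAlgorithm": "SHA256", "accessLevel": "Confidential"}
    wanted = {"General": ["logPath"], "MartiLQ": ["accessLevel"], "Hash": ["hashAlgorithm"]}
    print(load_overrides(SAMPLE_INI, defaults, wanted))
```

Using `fallback` avoids a `try`/`except` around every lookup. Whether the real client should log skipped keys is a separate design choice that this sketch does not cover.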

119
samples.md 100644
View File

@ -0,0 +1,119 @@
# Sample execution
A number of samples are provided to demonstrate what the **martiLQ** documents
look like and how simple the execution can be.
For the BSB (Bank State Branch) samples below, you will first need to fetch the files for
local processing. See TBA
## Python
If you have the required Python software and packages installed, and have Internet access,
then the following commands should generate output for you.
Open a terminal with the current directory set to the project root (here).
```
.\source\python\client\martiLQ.py -t MAKE -s "./docs/source/" -o "./test/python/results/test_proc_docs.json" -c ./docs/source/samples/json/sample_docs.ini -u https://github.com/meerkat-manor/marti/tree/draft_specifications/docs/source/ --filter "w*"
.\source\python\client\martiLQ.py -t GEN -s "./docs/source/samples/python/test/http/" -o "./test/python/results/test_proc_bsb.json" -c ./docs/source/samples/json/sample_bsb.ini -u http://apnedata.merebox.com.s3.ap-southeast-2.amazonaws.com/au/bsb/
```
For details using Python samples see
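The commands above pass the task, source, output, configuration, URL and filter as command-line options. A minimal sketch of how those options could be declared with `argparse` is given below. The short flags come from the commands above; the long option names, the help texts and `build_parser` itself are assumptions for illustration and may not match the actual `martiLQ.py`.

```python
import argparse


def build_parser():
    """Build a parser for the flags used in the sample commands (hypothetical layout)."""
    parser = argparse.ArgumentParser(
        prog="martiLQ.py",
        description="Sketch of a martiLQ command line; not the real client.",
    )
    # The task name is upper-cased so that "make" and "MAKE" are treated alike.
    parser.add_argument("-t", "--task", required=True, type=str.upper,
                        help="Task to run, for example MAKE or GET")
    parser.add_argument("-s", "--source", help="Source directory or file")
    parser.add_argument("-o", "--output", help="Output definition file")
    parser.add_argument("-c", "--config", help="Configuration (INI) file")
    parser.add_argument("-u", "--urlPrefix", dest="url_prefix",
                        help="URL prefix recorded for the resources (assumed meaning)")
    parser.add_argument("--filter", default="", help="File filter, for example w*")
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args(
        ["-t", "MAKE", "-s", "./docs/source/", "-o", "./out.json", "--filter", "w*"]
    )
    print(args.task, args.source, args.output, args.filter)
```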
## PowerShell
If you have the required PowerShell software and packages installed, and have Internet access,
then the following commands should generate output for you.
Open a terminal with the current directory set to the project root (here).
The PowerShell command
```ps1
# This sample will retrieve a number of CKAN files from
# Australian government sites to demonstrate conversion
.\test\powershell\test_MartiLQCkan.ps1
```
For details using PowerShell samples see
## Go
If you have the required Go software and packages installed, and have Internet access,
then the following commands should generate output for you.
Open a terminal with the current directory set to the project root (here).
A batch (cmd) script
```bat
SET MARTILQ_PROJECT_PATH=%CD%
CD %MARTILQ_PROJECT_PATH%\source\golang\client\src
go run . -- -t MAKE -m %MARTILQ_PROJECT_PATH%/test/golang/results/test_proc_docs.json -c %MARTILQ_PROJECT_PATH%/docs/source/samples/json/sample_docs.ini -s %MARTILQ_PROJECT_PATH%/docs/source --title "DOCS Sample" --description "Directory example for DOCS" --update
go run . -- -t MAKE -m %MARTILQ_PROJECT_PATH%/test/golang/results/test_proc_bsb.json -c %MARTILQ_PROJECT_PATH%/docs/source/samples/json/sample_bsb.ini -s %MARTILQ_PROJECT_PATH%/docs/source/samples/python/test/http --title "GEN005" --description "Directory example for BSB with filter" -R --filter "BSBDirectory.*\.csv" --update
cd %MARTILQ_PROJECT_PATH%
```
A PowerShell script to execute
```ps1
$env:MARTILQ_PROJECT_PATH=Get-Location
Set-Location -Path (Join-Path -Path $env:MARTILQ_PROJECT_PATH -ChildPath "source\golang\client\src") -PassThru
$mfile = Join-Path -Path $env:MARTILQ_PROJECT_PATH -ChildPath "test/golang/results/test_proc_docs.json"
$cfile = Join-Path -Path $env:MARTILQ_PROJECT_PATH -ChildPath "docs/source/samples/json/sample_docs.ini"
$spath = Join-Path -Path $env:MARTILQ_PROJECT_PATH -ChildPath "docs/source/"
go run . -- -t MAKE -m $mfile -c $cfile -s $spath --title "DOCS Sample" --description "Directory example for DOCS" --filter "w*" --update
$mfile = Join-Path -Path $env:MARTILQ_PROJECT_PATH -ChildPath "test/golang/results/test_proc_bsb.json"
$cfile = Join-Path -Path $env:MARTILQ_PROJECT_PATH -ChildPath "docs/source/samples/json/GEN005.ini"
$spath = Join-Path -Path $env:MARTILQ_PROJECT_PATH -ChildPath "docs/source/samples/python/test/http/"
go run . -- -t MAKE -m $mfile -c $cfile -s $spath --title "GEN005" --description "Directory example for BSB" --update
Set-Location -Path $env:MARTILQ_PROJECT_PATH -PassThru
```
For details using Go samples see
go run . -- -t GEN -m ./test/test_main_doc_Sample01.json -s ./docs/source/martilq.md --title "GEN001" --description "Simple example with no config"
go run . -- -t GEN -m ./test/test_main_doc_Sample02.json -c ./docs/source/samples/json/GEN002.ini -s ./docs/source/martilq.md --title "GEN002" --description "Simple example"
go run . -- -t GEN -m ./test/test_main_doc_Sample03.json -c ./docs/source/samples/json/GEN002.ini -s ./docs/source/ --title "GEN003" --description "Directory example"
go run . -- -t GEN -m ./test/test_main_doc_Sample04.json -s ./docs/source/ --title "GEN004" --description "Directory example with filter" -R --filter "r.*\.md"
go run . -- -t GEN -m ./test/test_main_doc_Sample05.json -c ./docs/source/samples/json/GEN005.ini -s C.\docs\source\samples\python\test\http\ --title "GEN005" --description "Directory example for BSB with filter" -R --filter "BSBDirectory.*\.csv"
https://github.com/meerkat-manor/marti/blob/draft_specifications/docs/source/martiLQ.md
https://github.com/meerkat-manor/marti/blob/draft_specifications/docs/source/martiLQ.md
SET GO_PROJECT_PATH=
go run . -- -t GEN -m ./test/test_main_doc_Sample05.json -c ./docs/source/samples/json/GEN005.ini -s .\docs\source\samples\python\test\http\ --title "GEN005" --description "Directory example for BSB with filter" -R --filter "BSBDirectory.*\.csv"
.\source\python\client\martiLQ.py -t GET -s "./test/python/results/data" -o "./test/python/results/test_proc_bsb.json"

View File

@ -5,7 +5,6 @@ import (
"os" "os"
"strings" "strings"
"merebox.com/martilq" "merebox.com/martilq"
"time"
"io/ioutil" "io/ioutil"
) )
@ -32,10 +31,6 @@ type Parameters struct {
var params Parameters var params Parameters
// go run . -- -t INIT -c ./test/my_martilq.ini
// go run . -- -t GEN -o ./test/test_martilq_directoryC.json -c ./config/martilq.ini -s ./martilq
// go run . -- -t GEN -o ./test/test_martilq_directoryC.json -c ./config/martilq.ini -s ./martilq --title "Sample run of GEN" --description "@./config/description.txt"
func loadArguments(args []string) { func loadArguments(args []string) {
@ -151,15 +146,6 @@ func loadArguments(args []string) {
} }
} }
if args[ix] == "--landing" {
matched = true
if ix < maxArgs {
ix = ix + 1
params.landing = args[ix]
} else {
panic("Missing parameter for LANDING")
}
}
if !matched && args[ix] != "--" { if !matched && args[ix] != "--" {
fmt.Println("Unrecognised command line argument: " + args[ix]) fmt.Println("Unrecognised command line argument: " + args[ix])
@ -173,8 +159,8 @@ func loadArguments(args []string) {
func printHelp() { func printHelp() {
fmt.Println("") fmt.Println("")
fmt.Println("\t marticli_client ") fmt.Println("\t martilqcli_client ")
fmt.Println("\t =============== ") fmt.Println("\t ================= ")
fmt.Println("") fmt.Println("")
fmt.Println("\tThis program is intended as a simple reference implementation") fmt.Println("\tThis program is intended as a simple reference implementation")
fmt.Println("\tin Go of the MartiLQ framework. It does not provide all") fmt.Println("\tin Go of the MartiLQ framework. It does not provide all")
@ -186,15 +172,16 @@ func printHelp() {
fmt.Println(" -h or --help : Display this help") fmt.Println(" -h or --help : Display this help")
fmt.Println(" -t or --task : Execute a predefined task which are") fmt.Println(" -t or --task : Execute a predefined task which are")
fmt.Println(" INIT initialise a new configuration file") fmt.Println(" INIT initialise a new configuration file")
fmt.Println(" GEN generate a MartiLQ definition file") fmt.Println(" MAKE make a MartiLQ definition file")
fmt.Println(" GET resources based on MartiLQ definition file")
fmt.Println(" RECON reconcile a MartiLQ definition file") fmt.Println(" RECON reconcile a MartiLQ definition file")
fmt.Println(" -c or --config : Configuration file used by all tasks") fmt.Println(" -c or --config : Configuration file used by all tasks")
fmt.Println(" This is the file written by the INIT task") fmt.Println(" This is the file written by the INIT task")
fmt.Println(" -s or --source : Source directory or file to build MartiLQ definition") fmt.Println(" -s or --source : Source directory or file to build MartiLQ definition")
fmt.Println(" This is used by the GEN and RECON task") fmt.Println(" This is used by the MAKE and RECON task")
fmt.Println(" -m or --martilq : MartiLQ definition file") fmt.Println(" -m or --martilq : MartiLQ definition file")
fmt.Println(" This is used by the GEN and RECON task") fmt.Println(" This is used by the MAKE and RECON task")
fmt.Println(" The GEN task generates the file while") fmt.Println(" The MAKE task makes the file while")
fmt.Println(" RECON task reads the file") fmt.Println(" RECON task reads the file")
fmt.Println(" -o or --output : Output file") fmt.Println(" -o or --output : Output file")
fmt.Println(" This is used by the RECON task") fmt.Println(" This is used by the RECON task")
@ -202,13 +189,16 @@ func printHelp() {
fmt.Println("") fmt.Println("")
fmt.Println(" --title : Title for the MartiLQ. Think of this as") fmt.Println(" --title : Title for the MartiLQ. Think of this as")
fmt.Println(" the job name") fmt.Println(" the job name")
fmt.Println(" This is used by the GEN task") fmt.Println(" This is used by the MAKE task")
fmt.Println(" --description : Description for the MartiLQ. This can be text") fmt.Println(" --description : Description for the MartiLQ. This can be text")
fmt.Println(" or a pointer to a file when the @ prefix is used") fmt.Println(" or a pointer to a file when the @ prefix is used")
fmt.Println(" This is used by the GEN task") fmt.Println(" This is used by the MAKE task")
fmt.Println(" --landing : Landing page for the definition in the MartiLQ") fmt.Println(" --update : Update an existing definition, otherwise fail if it already exists")
fmt.Println(" This is best if it is a URL") fmt.Println(" This is used by the MAKE task")
fmt.Println(" This is used by the GEN task") fmt.Println(" --filter : File filter")
fmt.Println(" This is used by the MAKE task")
fmt.Println(" -R or --recursive : Recursively process child folders")
fmt.Println(" This is used by the MAKE task")
fmt.Println("") fmt.Println("")
@ -233,6 +223,11 @@ func main () {
panic("Missing 'config' parameter") panic("Missing 'config' parameter")
} }
_, err := os.Stat(params.configPath)
if err == nil {
panic("MartiLQ configuration file '"+ params.configPath+"' already exists")
}
c := martilq.NewConfiguration() c := martilq.NewConfiguration()
if c.SaveConfig(params.configPath) != true { if c.SaveConfig(params.configPath) != true {
panic("Configuration not saved to: "+ params.configPath) panic("Configuration not saved to: "+ params.configPath)
@ -241,7 +236,7 @@ func main () {
matched = true matched = true
} }
if params.task == "GEN" { if params.task == "MAKE" {
if params.sourcePath == "" { if params.sourcePath == "" {
panic("Missing 'source' parameter") panic("Missing 'source' parameter")
@ -255,23 +250,25 @@ func main () {
panic("MartiLQ document '"+ params.definitionPath+"' already exists and update not specified") panic("MartiLQ document '"+ params.definitionPath+"' already exists and update not specified")
} }
m := martilq.ProcessFilePath(params.configPath, params.sourcePath, params.filter, params.recursive, params.urlPrefix, params.definitionPath ) m := martilq.Make(params.configPath, params.sourcePath, params.filter, params.recursive, params.urlPrefix, params.definitionPath )
if params.title != "" { if params.title != "" {
m.Title = params.title m.Title = params.title
} }
if params.landing != "" {
m.LandingPage = params.landing
}
if params.description != "" { if params.description != "" {
m.Description = params.description m.Description = params.description
} }
m.Modified = time.Now()
m.Save(params.definitionPath) m.Save(params.definitionPath)
fmt.Println("Created MARTILQ definition: " + params.definitionPath) fmt.Println("Created MARTILQ definition: " + params.definitionPath)
matched = true matched = true
} }
if params.task == "GET" {
fmt.Println("GET task not implemented")
matched = true
}
if params.task == "RECON" { if params.task == "RECON" {
_ = martilq.ReconcileFilePath(params.configPath, params.sourcePath, params.recursive, params.definitionPath, params.outputPath ) _ = martilq.ReconcileFilePath(params.configPath, params.sourcePath, params.recursive, params.definitionPath, params.outputPath )

View File

@ -20,6 +20,8 @@ const cEncoding = ""
type configuration struct { type configuration struct {
softwareName string softwareName string
dateFormat string
dateTimeFormat string
logPath string logPath string
tempPath string tempPath string
dataPath string dataPath string
@ -41,6 +43,8 @@ type configuration struct {
version string version string
expires string expires string
encoding string encoding string
compression string
describedBy string
hash bool hash bool
hashAlgorithm string hashAlgorithm string
@ -70,12 +74,17 @@ func NewConfiguration() configuration {
c.softwareName = GetSoftwareName() c.softwareName = GetSoftwareName()
c.dateFormat = "2006-01-02"
c.dateTimeFormat = "2006-01-02T15:04:05-0700"
c.title = "{{documentName}}" c.title = "{{documentName}}"
c.state = "active" c.state = "active"
c.accessLevel = "Confidential" c.accessLevel = "Confidential"
c.rights = "Restricted" c.rights = "Restricted"
c.expires = cExpires c.expires = cExpires
c.encoding = cEncoding c.encoding = cEncoding
c.compression = ""
c.describedBy = ""
c.batchInc = 0.001 c.batchInc = 0.001
c.urlPrefix = "file://" c.urlPrefix = "file://"
@ -143,6 +152,8 @@ func (c *configuration) SaveConfig(ConfigPath string) bool {
cfgini.Section("General").Key("logPath").SetValue (c.logPath) cfgini.Section("General").Key("logPath").SetValue (c.logPath)
cfgini.Section("General").Key("tempPath").SetValue (c.tempPath) cfgini.Section("General").Key("tempPath").SetValue (c.tempPath)
cfgini.Section("General").Key("dataPath").SetValue (c.dataPath) cfgini.Section("General").Key("dataPath").SetValue (c.dataPath)
cfgini.Section("General").Key("dateFormat").SetValue (c.dateFormat)
cfgini.Section("General").Key("dateTimeFormat").SetValue (c.dateTimeFormat)
cfgini.Section("MartiLQ").Key("tags").SetValue(c.tags) cfgini.Section("MartiLQ").Key("tags").SetValue(c.tags)
cfgini.Section("MartiLQ").Key("publisher").SetValue(c.publisher) cfgini.Section("MartiLQ").Key("publisher").SetValue(c.publisher)
@ -158,6 +169,7 @@ func (c *configuration) SaveConfig(ConfigPath string) bool {
cfgini.Section("Resources").Key("state").SetValue (c.state) cfgini.Section("Resources").Key("state").SetValue (c.state)
cfgini.Section("Resources").Key("expires").SetValue (c.expires) cfgini.Section("Resources").Key("expires").SetValue (c.expires)
cfgini.Section("Resources").Key("encoding").SetValue (c.encoding) cfgini.Section("Resources").Key("encoding").SetValue (c.encoding)
cfgini.Section("Resources").Key("compression").SetValue (c.compression)
cfgini.Section("Resources").Key("version").SetValue (c.version) cfgini.Section("Resources").Key("version").SetValue (c.version)
cfgini.Section("Resources").Key("urlPrefix").SetValue (c.urlPrefix) cfgini.Section("Resources").Key("urlPrefix").SetValue (c.urlPrefix)
@ -214,6 +226,8 @@ func (c *configuration) LoadConfig(ConfigPath string) bool {
c.logPath = cfgini.Section("General").Key("logPath").MustString(c.logPath) c.logPath = cfgini.Section("General").Key("logPath").MustString(c.logPath)
c.tempPath = cfgini.Section("General").Key("tempPath").MustString(c.tempPath) c.tempPath = cfgini.Section("General").Key("tempPath").MustString(c.tempPath)
c.dataPath = cfgini.Section("General").Key("dataPath").MustString(c.dataPath) c.dataPath = cfgini.Section("General").Key("dataPath").MustString(c.dataPath)
c.dateFormat = cfgini.Section("General").Key("dateFormat").MustString(c.dateFormat)
c.dateTimeFormat = cfgini.Section("General").Key("dateTimeFormat").MustString(c.dateTimeFormat)
c.tags = cfgini.Section("MartiLQ").Key("tags").MustString(c.tags) c.tags = cfgini.Section("MartiLQ").Key("tags").MustString(c.tags)
c.accessLevel = cfgini.Section("MartiLQ").Key("accessLevel").MustString(c.accessLevel) c.accessLevel = cfgini.Section("MartiLQ").Key("accessLevel").MustString(c.accessLevel)
@ -229,6 +243,7 @@ func (c *configuration) LoadConfig(ConfigPath string) bool {
c.state = cfgini.Section("Resources").Key("state").MustString(c.state) c.state = cfgini.Section("Resources").Key("state").MustString(c.state)
c.expires = cfgini.Section("Resources").Key("expires").MustString(c.expires) c.expires = cfgini.Section("Resources").Key("expires").MustString(c.expires)
c.encoding = cfgini.Section("Resources").Key("encoding").MustString(c.encoding) c.encoding = cfgini.Section("Resources").Key("encoding").MustString(c.encoding)
c.compression = cfgini.Section("Resources").Key("compression").MustString(c.compression)
c.urlPrefix = cfgini.Section("Resources").Key("urlPrefix").MustString(c.urlPrefix) c.urlPrefix = cfgini.Section("Resources").Key("urlPrefix").MustString(c.urlPrefix)
c.hashAlgorithm = cfgini.Section("Hash").Key("hashAlgorithm").MustString(c.hashAlgorithm) c.hashAlgorithm = cfgini.Section("Hash").Key("hashAlgorithm").MustString(c.hashAlgorithm)

View File

@ -27,7 +27,7 @@ type Marti struct {
Uid string `json:"uid"` Uid string `json:"uid"`
Description string `json:"description"` Description string `json:"description"`
Modified time.Time `json:"modified"` Modified string `json:"modified"`
Publisher string `json:"publisher"` Publisher string `json:"publisher"`
ContactPoint string `json:"contactPoint"` ContactPoint string `json:"contactPoint"`
AccessLevel string `json:"accessLevel"` AccessLevel string `json:"accessLevel"`
@ -58,6 +58,7 @@ func NewMarti() Marti {
m.Custom = append(m.Custom, software) m.Custom = append(m.Custom, software)
m.config = NewConfiguration() m.config = NewConfiguration()
m.Modified = time.Now().Format(m.config.dateTimeFormat)
return m return m
} }
@ -90,6 +91,7 @@ func (m *Marti) LoadConfig(ConfigPath string) {
m.Publisher = m.config.publisher m.Publisher = m.config.publisher
m.ContactPoint = m.config.contactPoint m.ContactPoint = m.config.contactPoint
m.AccessLevel = m.config.accessLevel m.AccessLevel = m.config.accessLevel
m.Modified = time.Now().Format(m.config.dateTimeFormat)
m.State = m.config.state m.State = m.config.state
m.Rights = m.config.rights m.Rights = m.config.rights
if m.config.tags != "" { if m.config.tags != "" {
@ -184,7 +186,7 @@ func (m *Marti) Save(pathFile string) bool {
} }
func ProcessFilePath(ConfigPath string, SourcePath string, Filter string, Recursive bool, UrlPrefix string, DefinitionPath string) Marti { func Make(ConfigPath string, SourcePath string, Filter string, Recursive bool, UrlPrefix string, DefinitionPath string) Marti {
m := NewMarti() m := NewMarti()

View File

@ -46,7 +46,7 @@ func TestMarti_DirectoryA(t *testing.T) {
SourcePath := currentDirectory SourcePath := currentDirectory
Recursive := false Recursive := false
DefPath := "../test/test_martilq_directoryA.json" DefPath := "../test/test_martilq_directoryA.json"
ProcessFilePath("", SourcePath, Recursive, DefPath, "") Make("", SourcePath, "", Recursive, "", DefPath)
} }
@ -56,6 +56,6 @@ func TestMarti_DirectoryB(t *testing.T) {
SourcePath := currentDirectory SourcePath := currentDirectory
Recursive := false Recursive := false
DefPath := "../test/test_martilq_directoryB.json" DefPath := "../test/test_martilq_directoryB.json"
ProcessFilePath("../config/martilq.ini", SourcePath, Recursive, DefPath, "") Make("../config/martilq.ini", SourcePath, "", Recursive, "", DefPath)
} }

View File

@ -19,9 +19,9 @@ type Resource struct {
Title string `json:"title"` Title string `json:"title"`
Uid string `json:"uid"` Uid string `json:"uid"`
DocumentName string `json:"documentName"` DocumentName string `json:"documentName"`
IssueDate time.Time `json:"issueDate"` IssueDate string `json:"issueDate"`
Modified time.Time `json:"modified"` Modified string `json:"modified"`
Expires time.Time `json:"expires"` Expires string `json:"expires"`
State string `json:"state"` State string `json:"state"`
Author string `json:"author"` Author string `json:"author"`
Length int64 `json:"length"` Length int64 `json:"length"`
@ -45,11 +45,13 @@ func NewResource(config configuration) Resource {
u := uuid.New() u := uuid.New()
r.Uid = u.String() r.Uid = u.String()
r.IssueDate = time.Now() r.IssueDate = time.Now().Format(config.dateTimeFormat)
r.State = config.state r.State = config.state
r.Author = config.author r.Author = config.author
r.Expires = config.ExpireDate("") r.Expires = config.ExpireDate("").Format(config.dateTimeFormat)
r.Encoding = config.encoding r.Encoding = config.encoding
r.Compression = config.compression
r.DescribedBy = config.describedBy
return r return r
} }
@ -76,11 +78,13 @@ func NewMartiLQResource(config configuration, sourcePath string, urlPath string,
r.State = config.state r.State = config.state
r.Author = config.author r.Author = config.author
r.Expires = config.ExpireDate(sourcePath) r.Expires = config.ExpireDate(sourcePath).Format(config.dateTimeFormat)
if time.Now().Before(r.Expires) && r.State == "expired" { if time.Now().Before(config.ExpireDate(sourcePath)) && r.State == "expired" {
r.State = "active" r.State = "active"
} }
r.Encoding = config.encoding r.Encoding = config.encoding
r.Compression = config.compression
r.DescribedBy = config.describedBy
r.DocumentName = stats.Name() r.DocumentName = stats.Name()
switch config.title { switch config.title {
@ -96,8 +100,8 @@ func NewMartiLQResource(config configuration, sourcePath string, urlPath string,
r.Title = config.title r.Title = config.title
} }
r.IssueDate = time.Now() r.IssueDate = time.Now().Format(config.dateTimeFormat)
r.Modified = stats.ModTime() r.Modified = stats.ModTime().Format(config.dateTimeFormat)
r.Url = urlPath r.Url = urlPath
r.Length = stats.Size() r.Length = stats.Size()
if !excludeHash { if !excludeHash {

View File

@ -18,7 +18,7 @@ public final class Launcher {
MartiLQ m = new MartiLQ(); MartiLQ m = new MartiLQ();
m.title = args[0]; m.title = args[0];
String fileName = "C:/Users/meerkat/source/marti/docs/samples/powershell/test/BSBDirectoryJul21-304.csv"; String fileName = "./docs/samples/powershell/test/BSBDirectoryJul21-304.csv";
Resource re = new Resource(fileName); Resource re = new Resource(fileName);
Attribute at = new Attribute(); Attribute at = new Attribute();
at.category = "cat"; at.category = "cat";

View File

@ -21,7 +21,7 @@ theme =
author = author =
title = documentName title = documentName
state = active state = active
expires = 2:0:0 expires = t:2:0:0
encoding = UTF-8 encoding = UTF-8
version = version =

View File

@ -17,7 +17,6 @@ Param(
$oMarti.license = $oCkan.result.license_id $oMarti.license = $oCkan.result.license_id
$oMarti.description = $oCkan.result.notes $oMarti.description = $oCkan.result.notes
$hashAlgo = "SHA256"
$version = "1.1.0" $version = "1.1.0"
[System.Collections.ArrayList]$lresource = @() [System.Collections.ArrayList]$lresource = @()
@ -31,6 +30,8 @@ Param(
$name = "" $name = ""
} }
$hash = New-MartiHash -Algorithm "SHA256" -FilePath "" -Value $_.hash
$oResource = [PSCustomObject]@{ $oResource = [PSCustomObject]@{
title = $_.name title = $_.name
uid = $_.id uid = $_.id
@ -40,8 +41,7 @@ Param(
state = $_.state state = $_.state
author = $oCkan.result.author author = $oCkan.result.author
length = $_.size length = $_.size
hash = $_.hash hash = $hash
hashAlgo = $hashAlgo
description = $_.description description = $_.description
url = $_.url url = $_.url

View File

@ -79,15 +79,16 @@ function New-MartiDefinition
$lcustom += $oSoftware $lcustom += $oSoftware
[System.Collections.ArrayList]$lresource = @() [System.Collections.ArrayList]$lresource = @()
$oMarti = [PSCustomObject]@{ $oMarti = [PSCustomObject]@{
"content-type" = "application/vnd.martilq.json" "content-type" = "application/vnd.martilq.json"
title = "" title = ""
uid = (New-Guid).ToString() uid = (New-Guid).ToString()
resources = $lresource
description = "" description = ""
issued = Get-Date -f "yyyy-MM-ddTHH:mm:ss"
modified = Get-Date -f "yyyy-MM-ddTHH:mm:ss" modified = Get-Date -f "yyyy-MM-ddTHH:mm:ss"
expires = ""
tags = @( "document", "marti") tags = @( "document", "marti")
publisher = $publisher publisher = $publisher
contactPoint = "" contactPoint = ""
@ -100,6 +101,7 @@ function New-MartiDefinition
landingPage = "" landingPage = ""
theme ="" theme =""
resources = $lresource
custom = $lCustom custom = $lCustom
} }

View File

@ -93,6 +93,7 @@ function New-MartiHash{
$oHash = [PSCustomObject]@{ $oHash = [PSCustomObject]@{
algo = $Algorithm algo = $Algorithm
value = $Value value = $Value
signed = $false
} }
return $oHash return $oHash
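The hash object returned here carries three fields: `algo`, `value` and the newly added `signed`. A minimal Python sketch of an equivalent structure is shown below, assuming the value is the hex digest of the resource bytes and that nothing is signed yet. The function name `new_marti_hash` is hypothetical and only mirrors the PowerShell `New-MartiHash` for illustration.

```python
import hashlib


def new_marti_hash(data: bytes, algorithm: str = "SHA256") -> dict:
    """Return a hash record with the same three fields as the PowerShell object.

    Assumption: `value` is the hex digest of `data`; `signed` stays False
    because no signing key is applied in this sketch.
    """
    # hashlib expects lower-case algorithm names such as "sha256" or "md5"
    digest = hashlib.new(algorithm.lower(), data).hexdigest()
    return {"algo": algorithm, "value": digest, "signed": False}


if __name__ == "__main__":
    print(new_marti_hash(b"BSB sample content"))
```

Keeping the algorithm name inside the hash record, rather than in a separate `hashAlgo` field on the resource, matches the change made to the CKAN conversion script in this commit.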

View File

@ -9,6 +9,7 @@ import datetime
import getpass import getpass
import hashlib import hashlib
import glob import glob
import argparse
from configparser import ConfigParser from configparser import ConfigParser
import requests import requests
import mimetypes import mimetypes
@ -49,19 +50,45 @@ class martiLQ:
self._oConfiguration = { self._oConfiguration = {
"softwareName": self.GetSoftwareName(), "softwareName": self.GetSoftwareName(),
"author": "Meerkat@merebox.com", "softwareAuthor": "Meerkat@merebox.com",
"version": self._SoftwareVersion, "softwareVersion": self._SoftwareVersion,
"logPath": None, "logPath": "./logs/",
"dateFormat": "2006-01-02",
"dateTimeFormat": "2006-01-02T15:04:05-0700",
"dataPath": "",
"tempPath": "",
"state": "active", "tags": None,
"publisher": "",
"contactPoint": "",
"license": "",
"accessLevel": "Confidential", "accessLevel": "Confidential",
"rights": "Restricted", "rights": "Restricted",
"batch": 1.0000,
"batchInc": 0.0001,
"theme": "",
"author": "",
"title": "{{documentName}}",
"state": "active",
"expires": "m:7:0:0",
"version": "1.0",
"urlPrefix": "",
"encoding": "",
"compression": "",
"encryption": "",
"describedBy": "",
"landingPage": "",
"hashAlgorithm": "SHA256", "hashAlgorithm": "SHA256",
"signKey_File": None, "signKey_File": None,
"signKey_Password": None, "signKey_Password": None,
"proxy": None,
"proxy_User": None,
"proxy_Credential": None,
"loaded": False "loaded": False
} }
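The `dateFormat` and `dateTimeFormat` defaults above are written as Go-style reference layouts, which Python's `strftime` does not understand directly. If the Python client is meant to honour the same configuration strings as the Go client, one option is a small translation step. The sketch below is an assumption about how that could be done, not the client's actual behaviour; `go_layout_to_strftime` is a hypothetical helper and it handles only the tokens that appear in this commit.

```python
from datetime import datetime, timedelta, timezone

# Ordered so that longer tokens are replaced before the shorter ones they contain.
_GO_TO_STRFTIME = [
    ("-0700", "%z"),  # numeric zone offset
    ("2006", "%Y"),   # four-digit year
    ("01", "%m"),     # month
    ("02", "%d"),     # day of month
    ("15", "%H"),     # hour, 24-hour clock
    ("04", "%M"),     # minute
    ("05", "%S"),     # second
]


def go_layout_to_strftime(layout: str) -> str:
    """Translate a Go reference layout into a strftime pattern (subset only)."""
    pattern = layout
    for go_token, py_token in _GO_TO_STRFTIME:
        pattern = pattern.replace(go_token, py_token)
    return pattern


if __name__ == "__main__":
    stamp = datetime(2021, 11, 10, 19, 55, 1, tzinfo=timezone(timedelta(hours=11)))
    print(stamp.strftime(go_layout_to_strftime("2006-01-02T15:04:05-0700")))
```

A plain token replacement like this is only safe for layouts that contain no other digits. A fuller implementation would need to tokenise the layout properly.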
@ -88,26 +115,57 @@ class martiLQ:
if config_object.has_section("General"): if config_object.has_section("General"):
items = config_object["General"] items = config_object["General"]
if not items is None: if not items is None:
config_attr = ["logPath"] config_attr = ["logPath", "tempPath", "dataPath", "dateFormat", "dateTimeFormat"]
for x in config_attr: for x in config_attr:
if not items[x] is None and not items[x] == "": try:
self._oConfiguration[x] = items[x] if not items[x] is None and not items[x] == "":
self._oConfiguration[x] = items[x]
except Exception:
self.WriteLog("Error in config ignored: " + x)
if config_object.has_section("MartiLQ"):
items = config_object["MartiLQ"]
if not items is None:
config_attr = ["rights", "accessLevel", "tags","publisher","batch","theme", "license","contactPoint"]
for x in config_attr:
try:
if not items[x] is None and not items[x] == "":
self._oConfiguration[x] = items[x]
except Exception:
self.WriteLog("Error in config ignored: " + x)
if config_object.has_section("Resources"): if config_object.has_section("Resources"):
items = config_object["Resources"] items = config_object["Resources"]
if not items is None: if not items is None:
config_attr = ["accessLevel", "rights", "state"] config_attr = ["state", "author", "title", "expires","encoding", "version", "urlPrefix", "compression", "encryption", "describedBy", "landingPage"]
for x in config_attr: for x in config_attr:
if not items[x] is None and not items[x] == "": try:
self._oConfiguration[x] = items[x] if not items[x] is None and not items[x] == "":
self._oConfiguration[x] = items[x]
except Exception:
self.WriteLog("Error in config ignored: " + x)
if config_object.has_section("Hash"): if config_object.has_section("Hash"):
items = config_object["Hash"] items = config_object["Hash"]
if not items is None: if not items is None:
config_attr = ["hashAlgorithm", "signKey_File", "signKey_Password"] config_attr = ["hashAlgorithm", "signKey_File", "signKey_Password"]
for x in config_attr: for x in config_attr:
if not items[x] is None and not items[x] == "": try:
self._oConfiguration[x] = items[x] if not items[x] is None and not items[x] == "":
self._oConfiguration[x] = items[x]
except Exception:
self.WriteLog("Error in config ignored: " + x)
if config_object.has_section("Network"):
items = config_object["Network"]
if not items is None:
config_attr = ["proxy", "proxy_User", "proxy_Credential"]
for x in config_attr:
try:
if not items[x] is None and not items[x] == "":
self._oConfiguration[x] = items[x]
except Exception:
self.WriteLog("Error in config ignored: " + x)
# Now check environmental values # Now check environmental values
self._oConfiguration["signKey_File"] = os.getenv("MARTILQ_SIGNKEY_FILE", self._oConfiguration["signKey_File"]) self._oConfiguration["signKey_File"] = os.getenv("MARTILQ_SIGNKEY_FILE", self._oConfiguration["signKey_File"])
@ -116,9 +174,174 @@ class martiLQ:
self.WriteLog("Configuration load processed") self.WriteLog("Configuration load processed")
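Review note: the five per-section blocks in `LoadConfig` repeat the same loop, so the same behaviour could be driven from one table. The sketch below is a minimal illustration only. `SECTION_KEYS` and `load_config_values` are hypothetical names that are not part of the commit, and the key lists are copied from the diff above (including the `dateFomat` spelling the commit uses). Interpolation is disabled on the assumption that values such as URLs may contain `%`.

```python
from configparser import ConfigParser

# Section -> keys, copied from the lists in the diff above.
SECTION_KEYS = {
    "General": ["logPath", "tempPath", "dataPath", "dateFomat", "dateTimeFormat"],
    "MartiLQ": ["rights", "accessLevel", "tags", "publisher", "batch",
                "theme", "license", "contactPoint"],
    "Resources": ["state", "author", "title", "expires", "encoding", "version",
                  "urlPrefix", "compression", "encryption", "describedBy",
                  "landingPage"],
    "Hash": ["hashAlgorithm", "signKey_File", "signKey_Password"],
    "Network": ["proxy", "proxy_User", "proxy_Credential"],
}


def load_config_values(config_text, defaults=None):
    """Hypothetical helper: overlay non-empty INI values on top of defaults."""
    parser = ConfigParser(interpolation=None)
    parser.read_string(config_text)
    values = dict(defaults or {})
    for section, keys in SECTION_KEYS.items():
        if not parser.has_section(section):
            continue  # a missing section is skipped, as in LoadConfig
        for key in keys:
            # fallback="" makes a missing key behave like an empty one
            value = parser.get(section, key, fallback="")
            if value != "":
                values[key] = value
    return values
```

A missing key no longer needs a `try`/`except`, because `fallback` covers it. Empty values leave the default untouched, which matches the checks in the commit.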
def SaveConfig(self, ConfigPath=None):
if not os.path.isfile(ConfigPath):
cfgfile = open(ConfigPath, 'w')
config_object = ConfigParser()
config_object.add_section("General")
config_attr = ["logPath", "tempPath", "dataPath", "dateFomat", "dateTimeFormat"]
for x in config_attr:
try:
if x in self._oConfiguration:
if self._oConfiguration[x] is None:
config_object.set("General", x, "")
elif type(self._oConfiguration[x]) is float or type(self._oConfiguration[x]) is int:
config_object.set("General", x, str(self._oConfiguration[x]))
else:
config_object.set("General", x, self._oConfiguration[x])
except Exception as e:
self.WriteLog("Error in config ignored: " + x + " = " + str(e))
config_object.add_section("MartiLQ")
config_attr = ["rights", "accessLevel", "tags","publisher","batch","theme", "license","contactPoint"]
for x in config_attr:
try:
if x in self._oConfiguration:
if self._oConfiguration[x] is None:
config_object.set("MartiLQ", x, "")
elif type(self._oConfiguration[x]) is float or type(self._oConfiguration[x]) is int:
config_object.set("MartiLQ", x, str(self._oConfiguration[x]))
else:
config_object.set("MartiLQ", x, self._oConfiguration[x])
except Exception as e:
self.WriteLog("Error in config ignored: " + x + " = " + str(e))
config_object.add_section("Resources")
config_attr = ["state", "author", "title", "expires","encoding", "version", "urlPrefix", "compression", "encryption", "describedBy", "landingPage"]
for x in config_attr:
try:
if x in self._oConfiguration:
if self._oConfiguration[x] is None:
config_object.set("Resources", x, "")
elif type(self._oConfiguration[x]) is float or type(self._oConfiguration[x]) is int:
config_object.set("Resources", x, str(self._oConfiguration[x]))
else:
config_object.set("Resources", x, self._oConfiguration[x])
except Exception as e:
self.WriteLog("Error in config ignored: " + x + " = " + str(e))
config_object.add_section("Hash")
config_attr = ["hashAlgorithm", "signKey_File", "signKey_Password"]
for x in config_attr:
try:
if x in self._oConfiguration:
if self._oConfiguration[x] is None:
config_object.set("Hash", x, "")
elif type(self._oConfiguration[x]) is float or type(self._oConfiguration[x]) is int:
config_object.set("Hash", x, str(self._oConfiguration[x]))
else:
config_object.set("Hash", x, self._oConfiguration[x])
except Exception as e:
self.WriteLog("Error in config ignored: " + x + " = " + str(e))
config_object.add_section("Network")
config_attr = ["proxy", "proxy_User", "proxy_Credential"]
for x in config_attr:
try:
if x in self._oConfiguration:
if self._oConfiguration[x] is None:
config_object.set("Network", x, "")
elif type(self._oConfiguration[x]) is float or type(self._oConfiguration[x]) is int:
config_object.set("Network", x, str(self._oConfiguration[x]))
else:
config_object.set("Network", x, self._oConfiguration[x])
except Exception as e:
self.WriteLog("Error in config ignored: " + x + " = " + str(e))
config_object.add_section("Custom_Spatial")
config_attr = ["enabled", "country", "region", "town"]
for x in config_attr:
config_object.set("Custom_Spatial", x, "")
config_object.add_section("Custom_Temporal")
config_attr = ["enabled", "businessDate", "runDate"]
for x in config_attr:
config_object.set("Custom_Temporal", x, "")
config_object.write(cfgfile)
cfgfile.close()
self.WriteLog("Configuration save processed")
return True
else:
self.WriteLog("Configuration file exists and new not saved")
return False
def ExpireDate(self, sourcePath): # time.Time
expires = datetime.datetime.today()
lExpires = self._oConfiguration["expires"].split(":")
if len(lExpires) != 4 and len(lExpires) != 7:
raise Exception("Expires value '"+ self._oConfiguration["expires"] +"' is invalid")
base = lExpires[0]
if sourcePath == "" and base == "m":
base = "t"
modified = datetime.datetime.today()
if base == "m":
try:
mtime = os.path.getmtime(sourcePath)
except OSError:
mtime = 0
modified = datetime.datetime.fromtimestamp(mtime)
lExpire = [0, 0, 0]
lExpire[0] = int(lExpires[1])
lExpire[1] = int(lExpires[2])
lExpire[2] = int(lExpires[3])
if len(lExpires) > 4:
lExpireD = [0,0,0]
lExpireD[0] = int(lExpires[4])
lExpireD[1] = int(lExpires[5])
lExpireD[2] = int(lExpires[6])
if base == "m":
expires = modified + datetime.timedelta(years=lExpire[0],months=lExpire[1],days=lExpire[2],hours=lExpireD[0], minutes=lExpireD[1], seconds=lExpireD[2])
elif base == "r":
expires = self._oConfiguration.temporal.RunDate + datetime.timedelta(years=lExpire[0],months=lExpire[1],days=lExpire[2],hours=lExpireD[0], minutes=lExpireD[1], seconds=lExpireD[2])
elif base == "b":
expires = self._oConfiguration.temporal.BusinessDate + datetime.timedelta(years=lExpire[0],months=lExpire[1],days=lExpire[2],hours=lExpireD[0], minutes=lExpireD[1], seconds=lExpireD[2])
#elif base == "t":
# fallthrough
else:
expires = datetime.datetime.today() + datetime.timedelta(years=lExpire[0],months=lExpire[1],days=lExpire[2],hours=lExpireD[0], minutes=lExpireD[1], seconds=lExpireD[2])
else:
if base == "m":
expires = modified + datetime.timedelta(years=lExpire[0],months=lExpire[1],days=lExpire[2])
elif base == "r":
expires = self._oConfiguration.temporal.RunDate + datetime.timedelta(years=lExpire[0],months=lExpire[1],days=lExpire[2])
elif base == "b":
expires = self._oConfiguration.temporal.BusinessDate + datetime.timedelta(years=lExpire[0],months=lExpire[1],days=lExpire[2])
#elif base == "t":
# fallthrough
else:
expires = datetime.datetime.today() + datetime.timedelta(days=lExpire[2])
expires = expires.replace(year=expires.year+lExpire[0])
if expires.month+lExpire[1] > 12:
expires = expires.replace(year=expires.year+1)
expires = expires.replace(month=expires.month+lExpire[1]-12)
else:
expires = expires.replace(month=expires.month+lExpire[1])
return expires
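Review note: `datetime.timedelta` accepts `days`, `hours`, `minutes` and `seconds`, but not `years` or `months`, so the `timedelta(years=..., months=...)` calls in `ExpireDate` would raise `TypeError` when reached. The sketch below shows one possible way to apply the `base:years:months:days[:hours:minutes:seconds]` offsets with calendar arithmetic. `parse_expires` and `add_expiry_offset` are hypothetical helper names, and clamping the day to the end of a shorter month is an assumption about the intended behaviour, not something the commit specifies.

```python
import calendar
import datetime


def parse_expires(spec):
    """Split an expires value such as 'm:0:1:0' into (base, six integers)."""
    parts = spec.split(":")
    if len(parts) not in (4, 7):
        raise ValueError("Expires value '{}' is invalid".format(spec))
    numbers = [int(p) for p in parts[1:]]
    numbers += [0] * (6 - len(numbers))  # pad a missing time part with zeros
    return parts[0], numbers


def add_expiry_offset(start, years=0, months=0, days=0,
                      hours=0, minutes=0, seconds=0):
    """Add a calendar offset to start (a datetime.datetime)."""
    # Work in whole months so that year rollover is handled in one step.
    month_index = start.month - 1 + months + 12 * years
    year = start.year + month_index // 12
    month = month_index % 12 + 1
    # Assumption: clamp to the last day of the target month (31 Jan -> 28 Feb).
    day = min(start.day, calendar.monthrange(year, month)[1])
    shifted = start.replace(year=year, month=month, day=day)
    return shifted + datetime.timedelta(days=days, hours=hours,
                                        minutes=minutes, seconds=seconds)


base, offsets = parse_expires("m:0:1:0")
expires = add_expiry_offset(datetime.datetime(2021, 11, 10), *offsets)
```

The base date (modified time, run date, business date or today) would be chosen by the caller from `base` and passed in as `start`, which keeps the date arithmetic in one place.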
def Set(self, MartiLQ): def Set(self, MartiLQ):
self._oMartiDefinition = MartiLQ self._oMartiDefinition = MartiLQ
def SetTitle(self, Title):
self._oMartiDefinition.title = Title
def Get(self): def Get(self):
return self._oMartiDefinition return self._oMartiDefinition
@ -136,15 +359,12 @@ class martiLQ:
self.WriteLog("Parameter: SourcePath Value: {}".format(JsonPath)) self.WriteLog("Parameter: SourcePath Value: {}".format(JsonPath))
self.WriteLog("") self.WriteLog("")
if os.path.exists(JsonPath): if not os.path.exists(JsonPath):
self.WriteLog("Overwriting existing definition") self.WriteLog("martiLQ document file '"+ JsonPath +"' does not exist")
else: raise Exception("martiLQ document file '{}' does not exist".format(JsonPath))
if not os.path.exists(os.path.dirname(JsonPath)):
self.WriteLog("Parent folder does not exist")
raise Exception("Parent folder '{}' does not exist".format(os.path.dirname(JsonPath)))
if not self._oMartiDefinition is None: if not self._oMartiDefinition is None:
self.WriteLog("Existing definition overwritten") self.WriteLog("Existing definition overwritten in memory")
jsonFile = open(JsonPath, "r") jsonFile = open(JsonPath, "r")
self._oMartiDefinition = json.load(jsonFile) self._oMartiDefinition = json.load(jsonFile)
@ -158,10 +378,17 @@ class martiLQ:
def GetConfig(self, Key=None): def GetConfig(self, Key=None):
if not Key is None: try:
return self._oConfiguration[Key] if not Key is None:
else: if Key == "tags":
return self._oConfiguration[Key].split(",")
return self._oConfiguration[Key]
else:
return None
except Exception:
self.WriteLog("Error in getting config: "+ Key)
return None return None
def Close(self): def Close(self):
if self._LogOpen: if self._LogOpen:
@ -229,9 +456,11 @@ class martiLQ:
def NewMartiDefinition(self): def NewMartiDefinition(self):
today = datetime.datetime.today() today = datetime.datetime.today()
dateToday = today.strftime("%Y-%m-%d") dateToday = today.strftime("%Y-%m-%dT%H:%M:%S")
publisher = getpass.getuser() publisher = self.GetConfig("publisher")
if publisher == "":
publisher = getpass.getuser()
lcustom = [] lcustom = []
lcustom.append(self._oSoftware) lcustom.append(self._oSoftware)
@ -246,16 +475,16 @@ class martiLQ:
"description": "", "description": "",
"modified": dateToday, "modified": dateToday,
"publisher": publisher, "publisher": publisher,
"contactPoint": "", "contactPoint": self.GetConfig("contactPoint"),
"accessLevel": self.GetConfig("accessLevel"), "accessLevel": self.GetConfig("accessLevel"),
"rights": self.GetConfig("rights"), "rights": self.GetConfig("rights"),
"tags": [], "tags": self.GetConfig("tags"),
"license": "", "license": self.GetConfig("license"),
"state": "active", "state": self.GetConfig("state"),
"batch": 1.0, "batch": self.GetConfig("batch"),
"describedBy": "", "describedBy": self.GetConfig("describedBy"),
"landingPage": "", "landingPage": self.GetConfig("landingPage"),
"theme": "", "theme": self.GetConfig("theme"),
"resources": lresource, "resources": lresource,
"custom": lcustom "custom": lcustom
@ -266,6 +495,7 @@ class martiLQ:
def Temporal(self): def Temporal(self):
oTemporal = { oTemporal = {
"enabled": False,
"extension": "temporal", "extension": "temporal",
"businessDate": "", "businessDate": "",
"runDate": "" "runDate": ""
@ -276,6 +506,7 @@ class martiLQ:
def Spatial(self): def Spatial(self):
oSpatial = { oSpatial = {
"enabled": False,
"country": "", "country": "",
"region": "", "region": "",
"town": "", "town": "",
@ -285,14 +516,22 @@ class martiLQ:
def NewMartiChildItem(self, SourceFolder, UrlPath=None, Recurse=True, ExtendAttributes=True, ExcludeHash=False, Filter ="*"): def NewMartiChildItem(self, SourceFolder, UrlPath=None, Recurse=True, ExtendAttributes=True, ExcludeHash=False, Filter ="*"):
SourceFullName = os.path.abspath(SourceFolder) if not SourceFolder.endswith("*"):
SourceFullName = os.path.abspath(SourceFolder)
SourceFullName = os.path.join(SourceFullName, Filter)
else:
SourceFullName = os.path.abspath(SourceFolder)
for fullName in glob.iglob(SourceFullName, recursive=Recurse): for fullName in glob.iglob(SourceFullName, recursive=Recurse):
if os.path.isfile(fullName): if os.path.isfile(fullName):
oResource = self.NewMartiLQResource(SourcePath=fullName, UrlPath=UrlPath, ExtendAttributes=ExtendAttributes, ExcludeHash=ExcludeHash) oResource = self.NewMartiLQResource(SourcePath=fullName, UrlPath=UrlPath, ExtendAttributes=ExtendAttributes, ExcludeHash=ExcludeHash)
if self._oMartiDefinition["resources"] is None:
print("MartiLQ definition resources not initialised")
self.WriteLog("MartiLQ definition resources not initialised")
self._oMartiDefinition["resources"].append(oResource) self._oMartiDefinition["resources"].append(oResource)
def GetContentType(self, SourcePath): def GetContentType(self, SourcePath):
ext = None ext = None
@ -348,26 +587,33 @@ class martiLQ:
lattribute = self.SetMartiLQResourceAttributes(SourcePath, str(os.path.splitext(SourcePath)[1][1:]).lower(), ExtendAttributes) lattribute = self.SetMartiLQResourceAttributes(SourcePath, str(os.path.splitext(SourcePath)[1][1:]).lower(), ExtendAttributes)
sTitle = self.GetConfig("title")
if sTitle == "{{documentName}}":
sTitle = item.replace(os.path.splitext(SourcePath)[1], "")
if sTitle == "{{documentName.ext}}":
sTitle = item
oResource = { oResource = {
"title": item.replace(os.path.splitext(SourcePath)[1], ""), "title": sTitle,
"uid": str(uuid.uuid4()), "uid": str(uuid.uuid4()),
"documentName": item, "documentName": item,
"issuedDate": dateToday, "issueDate": dateToday,
"modified": last_modified_date, "modified": last_modified_date,
"expires": "", "expires": self.ExpireDate(item).strftime("%Y-%m-%dT%H:%M:%S%z"),
"state": self.GetConfig("state"), "state": self.GetConfig("state"),
"author": self.GetConfig("author"), "author": self.GetConfig("author"),
"length": os.path.getsize(SourcePath), "length": os.path.getsize(SourcePath),
"hash": hash, "hash": hash,
"description": "", "description": "",
"url": "", "url": self.GetConfig("urlPrefix"),
"version": "", "version": self.GetConfig("version"),
"content-type": self.GetContentType(SourcePath), "content-type": self.GetContentType(SourcePath),
"encoding": None, "encoding": self.GetConfig("encoding"),
"compression": None, "compression": self.GetConfig("compression"),
"encryption": None, "encryption": self.GetConfig("encryption"),
"describedBy": self.GetConfig("describedBy"),
"landingPage": self.GetConfig("landingPage"),
"attributes": lattribute "attributes": lattribute
} }
@ -487,6 +733,33 @@ class martiLQ:
return Attributes return Attributes
def NewDefaultAnyAttributes(self, anyFileName):
records = 0
anyFile = open(anyFileName,'r')
while True:
next_line = anyFile.readline()
if not next_line:
break
records = records + 1
anyFile.close()
lattribute = []
oAttribute = {
"category": "dataset",
"name": "records",
"function": "count",
"comparison": "EQ",
"value": records
}
lattribute.append(oAttribute)
return lattribute
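Review note: the record count in `NewDefaultAnyAttributes` can also be written with a context manager, so the file is closed even if reading fails. This is a minimal sketch. `count_records` and `default_any_attributes` are hypothetical names, and `errors="replace"` is an assumption added here so that files that are not valid text do not raise a decoding error (the commit opens the file with default settings).

```python
def count_records(path, encoding="utf-8"):
    """Count the lines in a file, closing it even when reading fails."""
    # errors="replace" is an assumption: undecodable bytes are substituted
    # instead of raising UnicodeDecodeError.
    with open(path, "r", encoding=encoding, errors="replace") as handle:
        return sum(1 for _ in handle)


def default_any_attributes(path):
    """Build the single 'records' attribute used for unmatched file types."""
    return [{
        "category": "dataset",
        "name": "records",
        "function": "count",
        "comparison": "EQ",
        "value": count_records(path),
    }]
```

Iterating over the file handle yields one item per line, which gives the same count as the `readline()` loop in the commit.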
def NewDefaultCsvAttributes(self): def NewDefaultCsvAttributes(self):
@ -679,8 +952,10 @@ class martiLQ:
def SetMartiLQResourceAttributes(self, PathFile, FileType, ExtendedAttributes): def SetMartiLQResourceAttributes(self, PathFile, FileType, ExtendedAttributes):
lattribute = None lattribute = None
matched = False
if FileType == "csv": if FileType == "csv":
matched = True
lattribute = self.NewDefaultCsvAttributes() lattribute = self.NewDefaultCsvAttributes()
if ExtendedAttributes: if ExtendedAttributes:
@ -703,6 +978,7 @@ class martiLQ:
if FileType == "txt": if FileType == "txt":
matched = True
lattribute = self.NewDefaultCsvAttributes() lattribute = self.NewDefaultCsvAttributes()
if ExtendedAttributes: if ExtendedAttributes:
@ -724,19 +1000,25 @@ class martiLQ:
if FileType == "json": if FileType == "json":
matched = True
lattribute = self.NewDefaultJsonAttributes() lattribute = self.NewDefaultJsonAttributes()
if FileType == "zip": if FileType == "zip":
matched = True
lattribute = self.NewDefaultZipAttributes("ZIP") lattribute = self.NewDefaultZipAttributes("ZIP")
if ExtendedAttributes: if ExtendedAttributes:
self.SetAttributeValueNumber(lattribute, "dataset", "files", "count", 0, Comparison="NA") self.SetAttributeValueNumber(lattribute, "dataset", "files", "count", 0, Comparison="NA")
if FileType == "7z": if FileType == "7z":
matched = True
lattribute = self.NewDefaultZipAttributes("7Z") lattribute = self.NewDefaultZipAttributes("7Z")
if ExtendedAttributes: if ExtendedAttributes:
self.SetAttributeValueNumber(lattribute, "dataset", "files", "count", 0, Comparison="NA") self.SetAttributeValueNumber(lattribute, "dataset", "files", "count", 0, Comparison="NA")
if not matched:
lattribute = self.NewDefaultAnyAttributes(PathFile)
if lattribute == None: if lattribute == None:
lattribute = [] lattribute = []
@ -753,12 +1035,13 @@ class martiLQ:
with open(file_local, 'wb') as fl: with open(file_local, 'wb') as fl:
res = ftp.retrbinary(f"RETR {file_remote}", fl.write) res = ftp.retrbinary(f"RETR {file_remote}", fl.write)
if not res.startswith('226 Transfer complete'): if not res.startswith('226 Transfer complete'):
print('Download failed') print('Download failed for: '+file_remote)
self.WriteLog('Download failed for: '+file_remote)
if os.path.isfile(file_local): if os.path.isfile(file_local):
os.remove(file_local) os.remove(file_local)
except ftplib.all_errors as e: except ftplib.all_errors as e:
print('FTP error:', e) self.WriteLog('FTP error: ' + str(e))
if os.path.isfile(file_local): if os.path.isfile(file_local):
os.remove(file_local) os.remove(file_local)
@ -784,32 +1067,29 @@ class martiLQ:
if not resource["url"] is None and not resource["url"] == "": if not resource["url"] is None and not resource["url"] == "":
method = str(resource["url"].split(":", 2)[0]).lower() method = str(resource["url"].split(":", 2)[0]).lower()
#print("Method of fetch {} for {}".format(method, resource["url"]))
matched = False
if method == "ftp": if method == "ftp":
matched = True
parts = resource["url"].split("/", 3) parts = resource["url"].split("/", 3)
host = parts[2] host = parts[2]
file_remote = parts[3] file_remote = parts[3]
self.FtpPull(host, file_remote, os.path.join(TargetPath, resource["documentName"])) self.FtpPull(host, file_remote, os.path.join(TargetPath, resource["documentName"]))
fetched_files.append(os.path.join(TargetPath, resource["documentName"])) fetched_files.append(os.path.join(TargetPath, resource["documentName"]))
if method == "http" or method == "https": elif method == "http" or method == "https":
matched = True
response = requests.get(resource["url"]) response = requests.get(resource["url"])
if not response.status_code == 200: if not response.status_code == 200:
self.WriteLog("HTP fetch failed with code {} for '{}'".format(response.status_code, resource["url"])) self.WriteLog("HTTP fetch failed with code {} for '{}'".format(response.status_code, resource["url"]))
print("HTP fetch failed with code {} for '{}'".format(response.status_code, resource["url"])) print("HTTP fetch failed with code {} for '{}'".format(response.status_code, resource["url"]))
fetch_error.append(resource["url"]) fetch_error.append(resource["url"])
else: else:
with open(os.path.join(TargetPath, resource["documentName"]),'wb') as fh: with open(os.path.join(TargetPath, resource["documentName"]),'wb') as fh:
fh.write(response.content) fh.write(response.content)
fetched_files.append(os.path.join(TargetPath, resource["documentName"])) fetched_files.append(os.path.join(TargetPath, resource["documentName"]))
if method == "file": elif method == "file":
pass pass
if not matched: else:
fetch_error.append(resource["documentName"]) fetch_error.append(resource["documentName"])
else: else:
@ -930,3 +1210,127 @@ class martiLQ:
self.WriteLog("TestMartiDefinition function completed with {} errors".format(testError)) self.WriteLog("TestMartiDefinition function completed with {} errors".format(testError))
return oTestResults, testError return oTestResults, testError
def Make(ConfigPath, SourcePath, Filter, Recursive, UrlPrefix, DefinitionPath):
oMarti = martiLQ()
if ConfigPath != "":
oMarti.LoadConfig(ConfigPath)
oMarti.NewMartiDefinition()
oMarti.NewMartiChildItem(SourceFolder=SourcePath, UrlPath=UrlPrefix , ExcludeHash=False, Filter=Filter, Recurse=Recursive, ExtendAttributes=True)
if DefinitionPath != "":
oMarti.Save(DefinitionPath)
return oMarti
def GetResources(ConfigPath, OutputPath, DefinitionPath, Proxy=None, ProxyUser=None,ProxyCredential=None):
oMarti = martiLQ()
if ConfigPath != "":
oMarti.LoadConfig(ConfigPath)
oMarti.Load(DefinitionPath)
oMarti._oConfiguration["proxy"]=Proxy
fetched_files, fetch_error = oMarti.Fetch(OutputPath)
if len(fetch_error) > 0:
print("Fetch file error")
else:
print("Fetched files")
return fetched_files, fetch_error
def main():
parser = argparse.ArgumentParser(description='Processing for MartiLQ')
parser.add_argument("-t", "--task", dest="task", type=str,
choices=["INIT", "MAKE", "GET", "RECON"],
help='task to execute')
parser.add_argument("-s", "--source", dest="sourcePath",
help='path to source documents')
parser.add_argument("-c", "--config", dest="configPath",
help='path to configuration file')
parser.add_argument("-m", "--martilq", dest="definitionPath",
help='martiLQ document path')
parser.add_argument("-o", "--output", dest="outputPath",
help="output file path")
parser.add_argument("-u", "--url", dest="urlPrefix",
help="URL prefix for documents")
parser.add_argument("-R", "--recursive", action="store_false",
help="recursive processing for source")
parser.add_argument("--update", action="store_false",
help="allow update of existing martiLQ document")
parser.add_argument("--title", dest="title",
help="title for martiLQ document")
parser.add_argument("--filter", dest="filter",
default="*",
help="filter for source documents")
parser.add_argument("--description", dest="description",
help="description for document")
parser.add_argument("--landing", dest="landing",
help="landing detail for martiLQ document")
args = parser.parse_args()
if args.task == "INIT":
if args.configPath is None or args.configPath == "":
raise Exception("Configuration path parameter required")
m = martiLQ()
m.OpenLog()
if m.SaveConfig(args.configPath):
print("Saved martiLQ configuration: " + args.configPath)
else:
print("Error in saving configuration file")
m.CloseLog()
if args.task == "MAKE":
if args.sourcePath is None or args.sourcePath == "":
raise Exception("Source path parameter required")
if args.definitionPath is None or args.definitionPath == "":
raise Exception("martiLQ document (json) path and name parameter required")
m = Make(ConfigPath=args.configPath, SourcePath=args.sourcePath, Filter=args.filter, Recursive=args.recursive, UrlPrefix=args.urlPrefix, DefinitionPath=args.definitionPath)
if args.title:
m.Get()["title"] = args.title
if args.description:
m.Get()["description"] = args.description
m.Save(args.definitionPath)
m.CloseLog()
print("Saved martiLQ document: " + args.definitionPath)
if args.task == "GET":
if args.outputPath is None or args.outputPath == "":
raise Exception("Output path parameter required")
if args.definitionPath is None or args.definitionPath == "":
raise Exception("martiLQ document (json) path and name parameter required")
fetched_files, fetch_error = GetResources(ConfigPath=args.configPath, OutputPath=args.outputPath, DefinitionPath=args.definitionPath)
for item in fetched_files:
print("\t"+item)
print("GET Feature done")
if args.task == "RECON":
print("RECON Feature not implemented yet")
if __name__ == "__main__":
main()
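Review note: options in `main()` that have no `default` are `None` when omitted, so comparisons such as `ConfigPath != ""` in `Make` are true for a missing `--config`. In addition, `action="store_false"` makes `-R` switch recursion off rather than on. The sketch below shows a subset of the parser with explicit defaults. `build_parser` is a hypothetical name, and using `store_true` for `-R` is an assumption about the intended meaning of the flag.

```python
import argparse


def build_parser():
    """Illustrative subset of the martiLQ command line with explicit defaults."""
    parser = argparse.ArgumentParser(description="Processing for MartiLQ")
    parser.add_argument("-t", "--task", dest="task", required=True,
                        choices=["INIT", "MAKE", "GET", "RECON"],
                        help="task to execute")
    # default="" keeps checks such as `ConfigPath != ""` meaningful
    parser.add_argument("-c", "--config", dest="configPath", default="",
                        help="path to configuration file")
    # Assumption: passing -R should turn recursion on
    parser.add_argument("-R", "--recursive", action="store_true",
                        help="recursive processing for source")
    parser.add_argument("--title", dest="title", default="",
                        help="title for martiLQ document")
    return parser


args = build_parser().parse_args(["--task", "MAKE", "--title", "Docs"])
print(args.task, args.recursive, args.title, repr(args.configPath))
```

With string defaults, the existing `!= ""` tests behave as written, and an omitted option never reaches the document as `None`.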


@ -1,15 +1,18 @@
# .\test\powershell\test_MartiLQCkan.ps1 from project root
. .\source\powershell\MartiLQ.ps1 . .\source\powershell\MartiLQ.ps1
. .\source\powershell\MartiLQItem.ps1
. .\source\powershell\ConvertFrom-Ckan.ps1 . .\source\powershell\ConvertFrom-Ckan.ps1
$outFile = ".\test\powershell\results\marti_test_asic.json"
$ckan = Get-Content -Path ".\docs\samples\asic_ckan_api.json" -Raw $ckan = Get-Content -Path ".\docs\source\samples\json\asic_ckan_api.json" -Raw
$oMarti = ConvertFrom-Ckan -InputObject $ckan $oMarti = ConvertFrom-Ckan -InputObject $ckan
$x = ConvertTo-Json -InputObject $oMarti $x = ConvertTo-Json -InputObject $oMarti -Depth 5
Set-Content -Path ".\test\powershell\results\marti_test05.json" -Value $x Set-Content -Path $outFile -Value $x
Write-Host "Wrote converted definition to: $outFile"
$outFile = ".\test\powershell\results\marti_test_covid.json"
$covid_1 = Invoke-WebRequest "https://data.nsw.gov.au/data/api/3/action/package_show?id=793ac07d-a5f4-4851-835c-3f7158c19d15" $covid_1 = Invoke-WebRequest "https://data.nsw.gov.au/data/api/3/action/package_show?id=793ac07d-a5f4-4851-835c-3f7158c19d15"
$oMarti = ConvertFrom-Ckan -InputObject $covid_1 $oMarti = ConvertFrom-Ckan -InputObject $covid_1
$oMarti.description = "This data has been converted from NSW CKAN data source with URL 'https://data.nsw.gov.au/data/api/3/action/package_show?id=793ac07d-a5f4-4851-835c-3f7158c19d15'" $oMarti.description = "This data has been converted from NSW CKAN data source with URL 'https://data.nsw.gov.au/data/api/3/action/package_show?id=793ac07d-a5f4-4851-835c-3f7158c19d15'"
@ -17,11 +20,13 @@ $oMarti.tags += "ckan"
$oMarti.tags += "gov" $oMarti.tags += "gov"
$oMarti.tags += "nsw" $oMarti.tags += "nsw"
$oMarti.publisher = "NSW government (Australia)" $oMarti.publisher = "NSW government (Australia)"
$x = ConvertTo-Json -InputObject $oMarti $x = ConvertTo-Json -InputObject $oMarti -Depth 5
Set-Content -Path ".\test\powershell\results\marti_test06.json" -Value $x Set-Content -Path $outFile -Value $x
Write-Host "Wrote converted definition to: $outFile"
# cases # cases
$outFile = ".\test\powershell\results\marti_test_covidcases.json"
$covid19 = "https://data.nsw.gov.au/data/api/3/action/package_show?id=3dc5dc39-40b4-4ee9-8ec6-2d862a916dcf" $covid19 = "https://data.nsw.gov.au/data/api/3/action/package_show?id=3dc5dc39-40b4-4ee9-8ec6-2d862a916dcf"
$covid_2 = Invoke-WebRequest $covid19 $covid_2 = Invoke-WebRequest $covid19
$oMarti = ConvertFrom-Ckan -InputObject $covid_2 $oMarti = ConvertFrom-Ckan -InputObject $covid_2
@ -30,6 +35,23 @@ $oMarti.tags += "ckan"
$oMarti.tags += "gov" $oMarti.tags += "gov"
$oMarti.tags += "nsw" $oMarti.tags += "nsw"
$oMarti.publisher = "NSW government (Australia)" $oMarti.publisher = "NSW government (Australia)"
$x = ConvertTo-Json -InputObject $oMarti $x = ConvertTo-Json -InputObject $oMarti -Depth 5
Set-Content -Path ".\test\powershell\results\marti_test07.json" -Value $x Set-Content -Path $outFile -Value $x
Write-Host "Wrote converted definition to: $outFile"
# AFSL
$outFile = ".\test\powershell\results\marti_test_afsl.json"
$afsl = "https://data.gov.au/api/3/action/package_show?id=ab7eddce-84df-4098-bc8f-500d0d9776d1"
$afsl_2 = Invoke-WebRequest $afsl
$oMarti = ConvertFrom-Ckan -InputObject $afsl_2
$oMarti.description = "This data has been converted from DATA GOV AU CKAN data source with URL '$afsl'"
$oMarti.tags += "ckan"
$oMarti.tags += "gov"
$oMarti.tags += "au"
$oMarti.publisher = "Australian Securities and Investments Commission (ASIC)"
$x = ConvertTo-Json -InputObject $oMarti -Depth 5
Set-Content -Path $outFile -Value $x
Write-Host "Wrote converted definition to: $outFile"
Write-Host "Execution completed"


@ -10,7 +10,7 @@ from martiLQ import *
os.environ["MARTILQ_LOGPATH"] = "./test/python/results/logs" os.environ["MARTILQ_LOGPATH"] = "./test/python/results/logs"
print("Python test case #1") print("Python sample/test case")
mlq = martiLQ() mlq = martiLQ()
mlq.LoadConfig() mlq.LoadConfig()
@ -19,21 +19,35 @@ mlq.NewMartiChildItem(SourceFolder= "./docs/*", UrlPath="./docs" , ExcludeHash=F
oMarti["description"] = "Sample execution #1" oMarti["description"] = "Sample execution #1"
print("Save martiLQ definition #1") saveFile = "./test/python/results/DocsPlain1.json"
mlq.Save("./test/python/results/DocsPlain1.json") mlq.Save(saveFile)
print("Saved martiLQ document: " + saveFile)
print("Save martiLQ definition #2") saveFile = "./test/python/results/DocsPlain2.json"
oMarti["description"] = "Sample execution #2" oMarti["description"] = "Sample execution #2"
jsonFile = open("./test/python/results/DocsPlain2.json", "w") jsonFile = open(saveFile, "w")
jsonFile.write(json.dumps(oMarti, indent=5)) jsonFile.write(json.dumps(oMarti, indent=5))
jsonFile.close() jsonFile.close()
print("Base sample JSON written: DocsPlain2.json") print("Saved martiLQ document: " + saveFile)
print("Load martiLQ definition #1") saveFile = "./test/python/results/DocsPlain1.json"
mlq.Load("./test/python/results/DocsPlain1.json") print("Load martiLQ document: "+saveFile)
mlq.Load(saveFile)
oMarti = mlq.Get() oMarti = mlq.Get()
print("Definition description is: {}".format(oMarti["description"])) print("Definition description is: {}".format(oMarti["description"]))
mlq.CloseLog() mlq.CloseLog()
print("Completed Python test case #1") configPath = "./docs/source/samples/json/GEN005.ini"
sourcePath = "./docs/source/*"
saveFile = "./test/python/results/test_proc_docs.json"
ProcessFilePath(ConfigPath=configPath, SourcePath=sourcePath, Filter="", Recursive=True, UrlPrefix="https://localhost/", DefinitionPath=saveFile)
print("Saved martiLQ document: " + saveFile)
sourcePath = "./docs/source/samples/python/test/http/*"
saveFile = "./test/python/results/test_proc_bsb.json"
ProcessFilePath(ConfigPath=configPath, SourcePath=sourcePath, Filter="", Recursive=True, UrlPrefix="http://apnedata.merebox.com.s3.ap-southeast-2.amazonaws.com/au/bsb/", DefinitionPath=saveFile)
print("Saved martiLQ document: " + saveFile)
print("Completed Python sample/test cases")