Commit f13cb1c362ac9da875ddf4976c08997540f38144

Authored by Diego Ceccarelli
2 parents b4e7f95c 4ea4c652

Merge pull request #3 from dav009/master

Updating Readme - thanks to dav009
Showing 1 changed file with 55 additions and 1 deletion
README.md
... ... @@ -13,7 +13,7 @@ the command will produce a JAR file containing all the dependencies the target f
13 13  
14 14 #### Convert the Wikipedia XML to JSON ####
15 15  
16   - java target/json-wikipedia-1.0.0-jar-with-dependencies.jar it.cnr.isti.hpc.wikipedia.cli.MediawikiToJsonCLI -input wikipedia-dump.xml.bz -output wikipedia-dump.json[.gz] -lang [en|it]
  16 + java -cp target/json-wikipedia-1.0.0-jar-with-dependencies.jar it.cnr.isti.hpc.wikipedia.cli.MediawikiToJsonCLI -input wikipedia-dump.xml.bz -output wikipedia-dump.json[.gz] -lang [en|it]
17 17  
18 18 or
19 19  
... ... @@ -74,6 +74,60 @@ and import the project in your new maven project adding the dependency:
74 74 <artifactId>json-wikipedia</artifactId>
75 75 <version>1.0.0</version>
76 76 </dependency>
  77 +
  78 +#### Schema ####
  79 +
  80 +```
  81 + |-- categories: array (nullable = true)
  82 + | |-- element: struct (containsNull = false)
  83 + | | |-- description: string (nullable = true)
  84 + | | |-- id: string (nullable = true)
  85 + |-- externalLinks: array (nullable = true)
  86 + | |-- element: struct (containsNull = false)
  87 + | | |-- description: string (nullable = true)
  88 + | | |-- id: string (nullable = true)
  89 + |-- highlights: array (nullable = true)
  90 + | |-- element: string (containsNull = false)
  91 + |-- infobox: struct (nullable = true)
  92 + | |-- description: array (nullable = true)
  93 + | | |-- element: string (containsNull = false)
  94 + | |-- name: string (nullable = true)
  95 + |-- integerNamespace: integer (nullable = true)
  96 + |-- lang: string (nullable = true)
  97 + |-- links: array (nullable = true)
  98 + | |-- element: struct (containsNull = false)
  99 + | | |-- description: string (nullable = true)
  100 + | | |-- id: string (nullable = true)
  101 + |-- lists: array (nullable = true)
  102 + | |-- element: array (containsNull = false)
  103 + | | |-- element: string (containsNull = false)
  104 + |-- namespace: string (nullable = true)
  105 + |-- paragraphs: array (nullable = true)
  106 + | |-- element: string (containsNull = false)
  107 + |-- redirect: string (nullable = true)
  108 + |-- sections: array (nullable = true)
  109 + | |-- element: string (containsNull = false)
  110 + |-- tables: array (nullable = true)
  111 + | |-- element: struct (containsNull = false)
  112 + | | |-- name: string (nullable = true)
  113 + | | |-- numCols: integer (nullable = true)
  114 + | | |-- numRows: integer (nullable = true)
  115 + | | |-- table: array (nullable = true)
  116 + | | | |-- element: array (containsNull = false)
  117 + | | | | |-- element: string (containsNull = false)
  118 + |-- templates: array (nullable = true)
  119 + | |-- element: struct (containsNull = false)
  120 + | | |-- description: array (nullable = true)
  121 + | | | |-- element: string (containsNull = false)
  122 + | | |-- name: string (nullable = true)
  123 + |-- templatesSchema: array (nullable = true)
  124 + | |-- element: string (containsNull = false)
  125 + |-- timestamp: string (nullable = true)
  126 + |-- title: string (nullable = true)
  127 + |-- type: string (nullable = true)
  128 + |-- wid: integer (nullable = true)
  129 + |-- wikiTitle: string (nullable = true)
  130 +```
77 131  
78 132 #### Useful Links ####
79 133  
... ...
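
The schema added to README.md in this diff marks every top-level field as nullable, so code that reads the converted dump should tolerate missing or null values. As a minimal sketch of that, the Python below assumes the converter writes one JSON object per line (an assumption; the diff does not state the file layout) and that the output may be gzip-compressed, as the `[.gz]` suffix in the command suggests. The helper names `open_dump`, `parse_articles`, and `outgoing_link_ids` are hypothetical and not part of json-wikipedia.

```python
import gzip
import json


def open_dump(path):
    """Open a converted dump as text, transparently handling a .gz suffix."""
    opener = gzip.open if path.endswith(".gz") else open
    return opener(path, "rt", encoding="utf-8")


def parse_articles(lines):
    """Yield one dict per article.

    Assumption: each non-empty line holds one complete JSON object.
    """
    for line in lines:
        line = line.strip()
        if line:
            yield json.loads(line)


def outgoing_link_ids(article):
    """Return the ids of an article's links.

    The schema lists `links` as a nullable array of structs with nullable
    `id` and `description` fields, so null or missing values are skipped.
    """
    links = article.get("links") or []
    return [link["id"] for link in links if link.get("id")]
```

Under those assumptions, usage could look like `with open_dump("wikipedia-dump.json.gz") as handle:` followed by a loop over `parse_articles(handle)`. The same null-tolerant pattern applies to the other array fields in the schema, such as `categories` and `templates`.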