A REPORT
ON
WikiData: Harvesting Data Dumps into MongoDB and Constructing its Knowledge Graph
BY
ADITYA MANGLA
(2012A7PS209P)
ACKNOWLEDGEMENTS
Research opportunities and industrial exposure are the chief means for students to understand and appreciate the practical applications of theoretical concepts. The successful realization of any project is the outcome of the consolidated effort of a team comprising mentors and proteges. It is only with their support, guidance, inspiration and encouragement that any student can achieve his or her goals.
I would never have succeeded in completing my training without the cooperation and encouragement of many people. Firstly, my sincere thanks to the Gnowledge Lab team for their help during this internship.
I would like to take this opportunity to express my heartfelt gratitude to my project mentor Dr. Nagarjuna G., senior scientist and in-charge of the GLab at HBCSE, TIFR, for his constant guidance and overwhelming support. His wisdom, clarity of thought and persistent encouragement motivated me not only to take up this project but also to bring it to its present state. Working with him has been a great learning experience.
I would like to thank my PS-1 faculty in-charge, Dr. Bibhas Sarkar, for his constant support, caring nature and guidance at each stage of the internship. Special thanks to my student mentor, Mr. Akshay Hoshing, for his cordial support and guidance.
This project would not have been possible without the constant guidance and support of Mr. Sunny Chaudhary, Mr. Avadoot Nachankar, Mr. Kedar Aitawdekar and Mr. Dhiru Singh. A special word of thanks to all my fellow research interns at the institute for their constant support and willingness to discuss and deliberate on all issues.
A special thanks to my colleague and friend Rohan Badlani, with whom I did this project. In the spirit of open-source development, my gratitude goes to all the developers who have contributed to this project, and best wishes to all those who will do so in the future.
Working at the Homi Bhabha Centre for Science Education (TIFR) as a research intern has been an enriching experience for me, and I would like to express my deep gratitude towards everyone associated with the project. I look forward to such golden opportunities in the future.
Duration: 8 weeks
Date of start: 23rd May, 2014
Date of submission: 17th July, 2014
Title of project: Wikidata project for MetaStudio
Name: Aditya Mangla
ID: 2012A7PS209P
PS Faculty: Dr. Bibhas Ranjan Sarkar
Student Coordinator: Mr. Akshay Hoshing
Key Words
Wikidata, data dumps, data harvesting, incremental dumps, triples, N-Triples, Turtle triples, JSON, RDF, NDF architecture, knowledge graph, MetaStudio, GitHub, Python, Django, MongoDB, django-mongokit, D3JS, Topic, Theme, Freebase, YAGO database.
Project Areas
The entire project is based on open-source development and is part of the MetaStudio platform. The areas are as follows:
Website Development
- Front end: HTML5, JavaScript, CSS, D3JS
- Back end: Django, Python, MongoDB
Database Handling
- django-mongokit, MongoDB
Algorithms
- Iterative algorithms
- Recursive algorithms
- Depth-first search (DFS)
Abstract:
The aim of the project was to harvest an open-source data dump, such as Wikidata or the YAGO database, into the MongoDB structure used by MetaStudio. The project involved harvesting big data from online data stores dynamically through a Python script. This was to be achieved by running a robust, optimized Python script on a dedicated server.
A log file was to be maintained throughout the run of the script to keep track of all errors and exceptions (if any) thrown during its lifetime.
Subsequently, a Django app called Wikidata was to be developed to provide an intuitive front-end interface for MetaStudio users to access and browse the harvested data.
Finally, a knowledge graph was to be built from the harvested data for easy visualization using D3JS.
_________________________________     _________________________________     _________________________________
PS-1 Coordinator, HBCSE (TIFR)        Project In-charge, HBCSE (TIFR)       Software Developer, GLab
BITS Pilani                           Mentor In-charge                      Mentor
TABLE OF CONTENTS
ACKNOWLEDGEMENTS
ABSTRACT
1. Introduction
   1.1 About the Institute
   1.2 Project MetaStudio
   1.3 MetaStudio Framework
   1.4 Motivation behind the Project
   1.5 Aim of the Project
2. Contents
   2.1 Previous work done in data harvesting in MetaStudio
   2.2 Approach to the Project
   2.3 Design Steps
       2.3.1 Choice of Data Dump
       2.3.2 Choice of Mapping
       2.3.3 Choice of Algorithm
   2.6 Screenshots
       2.6.1 iterative_script running on localhost
       2.6.2 Front end of the Wikidata app
TABLE OF FIGURES
Figure 13: Log files created after the script has run
Figure 14: Wikidata app on MetaStudio
Figure 15: Number and names of all objects harvested by the script
Figure 16: Hover over and click on objects to view their details
Figure 17: Display details of an object
Figure 22: Location
PROJECT MetaStudio
(A web portal for making, sharing and seeking knowledge)
"Building the ship while sailing on it"
Vision: A free, open-source platform to MAKE, SHARE and SEEK.
WHAT IS METASTUDIO?
MetaStudio is an initiative of the Homi Bhabha Centre for Science Education, TIFR, Mumbai, India, for establishing collaboration among students, teachers, researchers and anyone else interested, to shape education and research in strikingly different ways. Why the platform is called "metaStudio" is described in the article metaStudio. However, joining this portal as a registered user does not imply that you endorse the idea of studio-based education. On the platform you can:
- Use it as a wiki: create wiki-style pages collaboratively on topics and subjects of your choice.
- Ask questions and respond to questions asked by others (you earn points for your contributions).
- Post announcements of events as well as reports on them, keeping everyone in the group up to date with experiments, observations, hypotheses and results in the scientific world online.
- Create a profile of your own and upload your bio-data, almost like on a social networking platform. This ensures transparency and trust in the scientific community while making the user experience more vibrant and dynamic.
- Start collaborative research projects on any area of interest under the Creative Commons license.
All in all, it is the vision of the makers of MetaStudio to make it a complete package serving the primary purpose of a common platform for scientific interaction, sharing and learning. At the same time, this path-breaking initiative aims to be different in its approach by including many user-friendly and attractive features, taking the experience of a school, a college, a science lab or even a natural observatory and putting it online, all for free. This way, science learning will no longer be limited by the physical barriers of time and space, and will reach down to the grassroots of humanity, bringing people closer irrespective of their diversity.
When you upload resources (digital documents and software), please ensure that you are uploading them under the Creative Commons license, another copyleft license, or into the public domain.
Another essential point to remember in the open-source community is that even though all information, code and data is open for access, it is ethically and legally mandatory to cite references and acknowledge the source of any open-source information.
METASTUDIO FRAMEWORK
The MetaStudio Framework is an NDF (Node Description Framework) in which a generic class called Node describes the basic structure of the objects present on the website. The design follows a dense object-oriented architecture.
PURPOSE OF PROJECT
There is also a separate generic class called Triple. Two classes inherit from Triple, namely GRelation and GAttribute. Triple is based on the concept of defining a subject and its associated value (be it the value of an attribute or a relation type).
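The Node/Triple hierarchy described above might be sketched as follows. Only the class names (Node, Triple, GRelation, GAttribute) and the altnames/name/content fields come from the report; the constructor signatures and remaining field names are illustrative assumptions, not MetaStudio's actual code.

```python
# Illustrative sketch of the NDF class hierarchy; field names beyond
# name/altnames/content are assumptions for illustration.

class Node(object):
    """Generic node: the basic structure of every object on the site."""
    def __init__(self, name, altnames=None, content=""):
        self.name = name
        self.altnames = altnames or []
        self.content = content

class Triple(object):
    """Generic subject plus an associated value (attribute or relation)."""
    def __init__(self, subject):
        self.subject = subject

class GAttribute(Triple):
    """Attribute triple: subject -- attribute_type --> literal value."""
    def __init__(self, subject, attribute_type, object_value):
        super(GAttribute, self).__init__(subject)
        self.attribute_type = attribute_type
        self.object_value = object_value

class GRelation(Triple):
    """Relation triple: subject -- relation_type --> another node."""
    def __init__(self, subject, relation_type, right_subject):
        super(GRelation, self).__init__(subject)
        self.relation_type = relation_type
        self.right_subject = right_subject
```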
Not only does the user get to know the meaning of an object, but he also gets to know the relative position of that object in the overall scheme of things. As a result of this display, the user can appreciate the kinds of relationships that exist between objects and the various aspects of those relations. Such graphs between objects are the fundamental driving principle behind many social networking platforms and even the page-ranking algorithms used by search engines such as Google. The same principle drives the concept of the Semantic Web envisioned and popularized by Tim Berners-Lee.
Step 1: Write a Python script to harvest the big data available in any of the data dumps and store it in the MongoDB structure used by MetaStudio.
Step 2: Write logging code to keep track of exceptions, errors and the overall progress of the script. The logging code is part of the Python script, and the log file itself is a text file created dynamically when the harvesting script is run.
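The report does not reproduce the actual logging code, so the following is only a minimal sketch of the kind of log file Step 2 describes: the harvesting script opens a text log and records progress, errors and exceptions as it runs. The file name, message format and the sample validity check are illustrative assumptions.

```python
# Sketch of a harvest log: progress and errors go to a plain text file.
import logging

def make_harvest_logger(path="harvest_log.txt"):
    logger = logging.getLogger("wikidata_harvest")
    logger.setLevel(logging.DEBUG)
    handler = logging.FileHandler(path, mode="w")
    handler.setFormatter(logging.Formatter("%(asctime)s %(levelname)s %(message)s"))
    logger.addHandler(handler)
    return logger

def harvest_one(logger, topic_id):
    """Process one topic id, logging success or the exception raised."""
    try:
        if not topic_id.startswith("Q"):          # hypothetical sanity check
            raise ValueError("not a Wikidata topic id: %r" % topic_id)
        # ... fetch and store the entity here ...
        logger.info("harvested %s", topic_id)
    except ValueError as exc:
        logger.error("skipped: %s", exc)          # script keeps running
```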
Step 3: Develop a knowledge graph based on the data harvested from Wikidata and incorporate it as a display option in the Wikidata app itself. This takes the user experience of the website to a whole new level. As a research- and study-oriented topic, the knowledge graph also provides a perfect case study for ontology and the study of semantic relationships, and forms the basis of the Semantic Web. The project thus adheres to the fundamental ideology and core beliefs on which the entire MetaStudio project is being developed. This is also what uniquely identifies the purpose of this project: to be able to visualize the relationships between various objects.
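As a rough illustration of the knowledge-graph step, harvested triples can be shaped into the node/link JSON that D3JS force layouts typically consume. The "nodes"/"links"/"source"/"target" field names below follow the common D3 convention, not necessarily the project's actual code, and the sample triples are invented.

```python
# Sketch: turn (subject, relation, object) triples into D3-style graph JSON.
import json

def build_graph(triples):
    """triples: iterable of (subject, relation, object) name strings."""
    index = {}            # name -> position of its node in the nodes list
    nodes, links = [], []
    def node_id(name):
        if name not in index:         # create each node only once
            index[name] = len(nodes)
            nodes.append({"name": name})
        return index[name]
    for subj, rel, obj in triples:
        links.append({"source": node_id(subj),
                      "target": node_id(obj),
                      "relation": rel})
    return {"nodes": nodes, "links": links}

graph = build_graph([("Q17", "instance of", "country"),
                     ("Q17", "capital", "Tokyo")])
payload = json.dumps(graph)   # this JSON can be served to the D3 front end
```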
DESIGN STEPS
1) Choice of data dump
2) Choice of mapping
3) Choice of algorithm

Choice of Data Dump
The most crucial choice before starting the project is the choice of data dump. All data dumps are essentially triples in one form or another, with minor differences in organization, content, amount of information and so on.
A choice had to be made among the three most popular data dumps:
1) Freebase: an open-source project founded in 2004 and acquired by Google in 2010. It provides RDF data in a serialized N-Triples format.
2) YAGO: a knowledge base derived automatically from Wikipedia and WordNet, available as a full downloadable dump.
3) Wikidata: a free knowledge base that can be read and edited by humans and machines alike. It is for data what Wikimedia Commons is for media files: it centralizes access to and management of structured data, such as interwiki references and statistical information.
Format: Q<id_of_topic>, e.g. Q1 - universe, Q100 - Boston.
Advantages:
1. Only the list of all topic ids is required; there is no need to process data in RDF format to harvest it.
2. Wikidata also provides updated dumps along with statistical information about the items.
The final choice of data dump was Wikidata, for the following reasons:
a) As per our discussion with Prof. GN, Wikidata is one of the biggest and most extensive databases, which justifies the choice. Interestingly, that is not all: during our groundwork research for the project, we realised that, unlike YAGO, the entire database need not be downloaded. Gone are the days of first downloading hundreds of gigabytes of data onto the servers and then processing the big data.
b) All we need is a file containing a list of all objects (such as Q2 - Earth, Q100 - Boston) and a working internet connection, and the script is good to go. The system holds just a 400 MB file containing object ids.
Herein lies the advantage of Wikidata. All information regarding the triples, which are basically relations and attributes, is found in JSON files available online at URLs like
http://www.wikidata.org/wiki/Special:EntityData/Q17.json
c) So all we need to do to access the data for an object is to access the URL dynamically and parse its JSON using built-in Python modules such as urllib2, json and csv (the latter used to parse the text file of topic ids).
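The dynamic access just described might look like the sketch below: build the Special:EntityData URL for a topic id, fetch it, and pull out the fields of interest. The original script used Python 2's urllib2; this sketch uses Python 3's urllib.request, and assumes the documented "entities" -> <id> -> "labels"/"descriptions"/"aliases" layout of the EntityData JSON.

```python
# Sketch of fetching and parsing one Wikidata entity's JSON (Python 3).
import json
import urllib.request

def entity_url(topic_id):
    return "http://www.wikidata.org/wiki/Special:EntityData/%s.json" % topic_id

def fetch_entity(topic_id):
    """Download the EntityData JSON for one topic id (needs a network)."""
    with urllib.request.urlopen(entity_url(topic_id)) as resp:
        return json.load(resp)

def parse_entity(payload, topic_id, lang="en"):
    """Extract the English label, description and aliases from the payload."""
    entity = payload["entities"][topic_id]
    return {
        "name": entity["labels"][lang]["value"],
        "content": entity["descriptions"][lang]["value"],
        "altnames": [a["value"] for a in entity.get("aliases", {}).get(lang, [])],
    }
```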
Choice of Mapping
There was an obvious need to map the fields of the Wikidata JSON to the MongoDB structure of MetaStudio. For example:
- Aliases (from the Wikidata JSON) -> altnames (of the MongoDB class Node)
- Label (extracted in English from the Wikidata JSON) -> name (of the MongoDB class Node)
- Descriptions (extracted in English from the Wikidata JSON) -> content (of the MongoDB class Node)
- Q<id> (from the Wikidata JSON) -> an attribute called topic_id (AttributeType topic_id)
- Globe-coordinates -> stored as standard GeoJSON in location (of the MongoDB class Node)
All the relations and attributes are present as claims in the Wikidata JSON. For attributes that could not be harvested directly, a suitable AttributeType was created first, and the attribute was then created under it.
The JSON of Japan (Q17): http://www.wikidata.org/wiki/Special:EntityData/Q17.json
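The field mapping tabulated above can be sketched as a single function that shapes one parsed entity into a MongoDB-style document. The document keys follow the report's mapping; the attribute_set key and the GeoJSON "Point" layout for globe-coordinates are assumptions (the Point form with [longitude, latitude] order is the standard GeoJSON convention), not necessarily MetaStudio's exact schema.

```python
# Sketch: map one parsed Wikidata entity onto a Node-shaped MongoDB document.
def to_node_document(topic_id, label, description, aliases, coord=None):
    doc = {
        "name": label,           # Label (en)        -> name
        "content": description,  # Descriptions (en) -> content
        "altnames": aliases,     # Aliases           -> altnames
        "attribute_set": [{"topic_id": topic_id}],   # Q<id> -> topic_id attribute
    }
    if coord is not None:        # globe-coordinate claim, if present
        lat, lon = coord
        # GeoJSON points are [longitude, latitude]
        doc["location"] = {"type": "Point", "coordinates": [lon, lat]}
    return doc
```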
Choice of Algorithm
According to our research, the knowledge graph can be developed from the data dump using two fundamental algorithmic approaches:
1) Recursive: processing each object and its relations and attributes one at a time, almost dynamically (on the fly). E.g. "Ramesh is a student of BITS Pilani." Time complexity: O(n).
2) Iterative: first creating nodes for all objects, then processing all relationships and attributes for each object. Time complexity: O(n^2). Quite simple and intuitive.
For Wikidata, the recursive algorithm is much more optimized, as it works in a DFS-like manner in linear time.
The problem with the recursive algorithm: it works in a depth-first search (DFS) manner, going deeper and deeper into the tree until it bottoms out (reaches a leaf), and only then starts returning.
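The two approaches can be contrasted on a toy in-memory graph. This is only a sketch of the two strategies described above, not the project's actual harvesting code: the recursive version processes each object and immediately descends into its related objects, DFS-style, while the iterative version first creates every node and only then wires up relations in a second pass.

```python
# Sketch: recursive (DFS-style) vs iterative (two-pass) harvesting.
def harvest_recursive(graph, start, seen=None):
    """graph: {topic_id: [related topic_ids]}; returns DFS visit order."""
    if seen is None:
        seen = []
    if start in seen:
        return seen
    seen.append(start)                           # create/process this node
    for related in graph.get(start, []):
        harvest_recursive(graph, related, seen)  # descend immediately
    return seen

def harvest_iterative(graph):
    nodes = sorted(graph)                        # pass 1: create all nodes
    links = [(s, t) for s in nodes for t in graph[s]]  # pass 2: all relations
    return nodes, links
```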
Week 1 (23rd May, 2014 to 29th May, 2014): The first two days were primarily used for orientation programmes. This included an enriching talk about the organization in general (HBCSE, TIFR) and the GLab in particular, an open-source lab established on the principles of open-source development laid down by notable names like Richard Stallman and Linus Torvalds. The project topics and the required skill sets were also discussed in depth.
In the following days, sessions were held by the PS-1 instructor, Dr. Bibhas Sarkar, and the student mentor, Mr. Akshay Hoshing, to explain the guidelines of the PS-1 program.
In the first week itself, cordial discussions were organised with our peers regarding every project. The details of every project were deliberated on at length with our mentor, Dr. Nagarjuna, as was the required skill set for each project. Most importantly, this was the period when most of us were beginning our journey into the vast and amazing world of open-source development. Everyone was expected to install the Ubuntu operating system. In the words of our extremely accommodating mentor, the choice of project was to be made based on the level of interest and enthusiasm; everybody's opinion was accommodated, queries were resolved, and only then was the allocation process completed. Some points that were clarified:
1) All students at HBCSE (TIFR) for PS-1 would be working on an open-source project, which is basically a science-learning platform following the NDF architecture.
2) The common skill sets for almost all projects included Python programming, the Django framework, the MongoDB NoSQL database, and JavaScript (for front-end work, if any).
3) All projects added features or functionality to the MetaStudio framework, which was already live at that time. It was quite motivating to know that all our work would finally contribute to a real-life, live project on the internet for everybody to see.
Week 2 (30th May, 2014 to 6th June, 2014): The entire batch of interns started off by understanding the fundamental aspects and principles of the open-source project MetaStudio. Some essential common skills had to be acquired before any real contribution to the code could happen, so we started by mastering the following concepts and tools:
1) Python: an open-source, object-oriented, high-level language developed by Guido van Rossum. I referred to multiple sources, such as books, the official documentation and blogs, to acquire a working knowledge of the language.
2) Django framework: a robust and popular open-source web framework written in Python. All of MetaStudio is built using Django, and I followed the official Django documentation to learn how to develop apps in the framework. This was a relatively challenging but immensely satisfying task. Learning Django involved making apps, running the server on localhost, and so on. This part of the project led to greater learning about back-end web development, interaction with the database, administrator management and query optimization.
3) MongoDB: this aspect deserves special attention, as NoSQL databases are one of the most recent developments in modern computer science. The concept, though covered briefly in class, is still quite challenging and interesting, and I got to pursue my academic interest in databases by covering this hitherto new area of NoSQL data stores. Various NoSQL data stores are available, such as Cassandra and MongoDB. The database used in MetaStudio is MongoDB, and learning it proved a very enriching and valuable part of the PS. It should be noted that NoSQL databases are schema-less, hence very flexible, and query execution on them is highly optimized and dynamic.
Primarily, I focussed on understanding the use case of the project and the data-flow diagram. It was at this stage, in the words of Prof. GN, that we could truly say: "Open-source development is like building a ship while sailing on it."
This also contributed to a sense of fulfilment at working not on a theoretical or redundant project but on a live, growing, up-to-date project led by such a wonderfully creative, supportive and ever-helpful mentor.
Week 3 (7th June, 2014 to 13th June, 2014): The momentum had now been gathered and all of us were confident of the true extent and requirements of our respective projects. It was now that we faced another unexpected challenge, called Git.
Developed by the legendary programmer and 'messiah' of the open-source movement, Linus Torvalds, Git is what is called a VCS (Version Control System). It enables multiple contributors/developers to contribute simultaneously to an ongoing project from their own locations, without any physical contact whatsoever. The interesting part is that each change, however small and insignificant it may seem, is always stored as a version (think of it as a snapshot at a moment in time), and as the changes keep coming, the author or administrator of the project may, at any instant, revert to an older version. This way the system becomes foolproof: no code is ever lost, as each change is recorded as a version. Git also maintains records of the contributions of each developer, hence giving a true picture and almost the entire life story of any project. It has several variations and various other features. We worked on GitHub, and it is at this stage that we learnt its terminology: repository, clone, fork, push, pull and so on.
Week 7 (4th July, 2014 to 10th July, 2014): The groundwork was done, as all the basic portions had been coded and tested. It was towards this time that our mentor pointed out some gaping flaws in the system design, so the code needed restructuring. The script final_script2 gave way to the final script, called iterative_script.py, with a finally working algorithm. Server access to a public IP was given, and the script was run on that server.
Week 8 (11th July, 2014 to 17th July, 2014): The front end of the app was improved by adding features like tag-based navigation as in Wikipedia, location mapping based on globe coordinates and, most importantly, the concept graph (a kind of knowledge graph).
Reports, presentations and documentation were prepared, and a pull request was sent to the mentors. This marked a successful end to the project and to the PS-1 program as well.
SCREEN SHOTS
(I) The Python script iterative_script.py running on the local machine
(II) Knowledge map
Fig 20: Location
CONCLUSION
The aim of the project was to write a Python script to harvest big data dumps from an open-source data collection, such as Wikidata or the YAGO database, and store them in the MongoDB structure used by the MetaStudio project.
The next part of the project was to code a logging script in Python to maintain a record of all exceptions, errors and warnings encountered dynamically as the script runs. All such messages are written to a text file.
Then, from the harvested data, a Django app called Wikidata was developed to display the harvested big data in an intuitive and comprehensive manner. This includes navigation through tags and the display of a knowledge graph in which every item (GSystem or GSystemType) is a node, and the links between nodes are the relationships that exist between these items.
The algorithm used is a custom, modified one that is primarily iterative in nature but contains inner functions that progress in a depth-first-search-like manner.
The skill set developed during the project includes Python, the Django web framework, MongoDB, front-end web development (HTML, CSS, JavaScript) and working with Git. Most importantly, the project inculcated some essential life skills: teamwork, confidence, and comprehensive articulation in both writing and speaking.
Above all, the PS-1 experience gave me an opportunity to explore the application of classroom knowledge to real-life, live projects, be it in database management systems, query optimization, data structures and algorithms, data mining or even operating systems (OS). It gave me an opportunity to work in a leading centre for computer science research, understand its pedagogy and work culture, meet leading researchers and scientists, work on state-of-the-art platforms and explore cutting-edge technology. Last but not least, it allowed me to explore my own interests in the field of computer science and showed me my weaknesses and strengths. This has surely proved to be much more than an internship and will be cherished fondly as a life experience.
FUTURE SCOPE
It is a defining characteristic of any good and successful open-source project that it clearly identifies its flaws and limitations and lays out a vision for the next programmer/contributor, so that the voyage continues and the platform keeps developing. I have tried to do the same.
1) Need for greater integration: a tag-cloud-like feature or other D3JS data representation schemes, such as bar graphs, may be integrated into the front end for a better understanding of the data. Humans respond better and faster to visual representations of data, so the whole user experience will be taken to a new level if the harvested big data is displayed in an intuitive fashion.
2) A search bar has been provided by us in the UI, and the search team's work should be integrated into it so as to make accessing objects easier and faster. Both of these items lay outside the scope of our project.
3) A possible extension is to create a UI based on filters and options, in which the user can select options to see certain pages of Wikidata and explore them in a well-thought-out manner:
Theme 1: Topic 1, Topic 2, ..., Topic N
Theme 2: Topic 1, Topic 2, ..., Topic M
BIBLIOGRAPHY
1) Beginning Python: From Novice to Professional, by Magnus Lie Hetland
2) www.djangoproject.com - the official documentation of Django:
   https://docs.djangoproject.com/en/1.6/intro/tutorial01
   https://docs.djangoproject.com/en/1.6/intro/tutorial02
   https://docs.djangoproject.com/en/1.6/intro/tutorial03
   https://docs.djangoproject.com/en/1.6/intro/tutorial04
   https://docs.djangoproject.com/en/1.6/intro/tutorial05
   https://docs.djangoproject.com/en/1.6/intro/tutorial06
3) Python video tutorials by the New Boston
4) www.freebase.com - Freebase API and official documentation
5) www.wikidata.org/wiki/Wikidata:Main_Page
6) dumps.wikimedia.org
7) www.tutorialspoint.com - for Python and MongoDB
8) www.github.com - GitHub
9) www.d3js.org - official page for Data-Driven Documents (D3), a powerful JavaScript library for creating knowledge graphs and other visual representations of data
10) https://bost.ocks.org/mike/ - tutorials and samples for D3JS code
11) https://github.com/peterbe/django-mongokit - open-source documentation of django-mongokit
REFERENCES
1) Singhal, Amit (May 16, 2012). "Introducing the Knowledge Graph: Things, Not Strings". Official Google Blog. Retrieved May 18, 2012.
2) Waters, Richard (May 16, 2012). "Google To Unveil Search Results Overhaul". Financial Times. Retrieved May 16, 2012.
3) http://en.wikipedia.org/wiki/Turtle_(syntax)
4) http://en.wikipedia.org/wiki/N-Triples
5) http://en.wikipedia.org/wiki/Semantic_Web
6) http://www.wikidata.org/wiki/Special:EntityData/Q17.json
GLOSSARY
Data Dumps: downloadable versions of big data available in a specific format, maintained as different versions according to updates in the data. They are available from various sources in various formats, such as the YAGO database, Freebase and Wikidata. The choice of data dump is a crucial decision that had to be taken before starting the coding.
Knowledge Graph: a semantic methodology for structuring data as a collection of nodes joined by links (which are relationships). "Knowledge Graph" is also the name of the knowledge base used by Google to enhance its search engine's results with semantic-search information gathered from a wide variety of sources. The Knowledge Graph display was added to Google's search engine in 2012, starting in the United States, having been announced on May 16, 2012.[1] It provides structured and detailed information about a topic in addition to a list of links to other sites, the goal being that users can resolve their query without having to navigate to other sites and assemble the information themselves.[2]
N-Triples: N-Triples[4] is a format for storing and transmitting data. It is a line-based, plain-text serialisation format for RDF (Resource Description Framework) graphs, and a subset of the Turtle (Terse RDF Triple Language) format.
RDF triples: encoded as an RDF triple, the subject and predicate must be resources named by URIs; the object may be a resource/URI or a literal. For example, in the N-Triples form of RDF, a statement might look like:
<http://www.w3.org/People/EM/contact#me> <http://www.w3.org/2000/10/swap/pim/contact#fullName> "Eric Miller" .
<http://www.w3.org/People/EM/contact#me> <http://www.w3.org/2000/10/swap/pim/contact#mailbox> <mailto:em@w3.org> .
<http://www.w3.org/People/EM/contact#me> <http://www.w3.org/2000/10/swap/pim/contact#personalTitle> "Dr." .
<http://www.w3.org/People/EM/contact#me> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://www.w3.org/2000/10/swap/pim/contact#Person> .
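To make the format concrete, here is a minimal parser for simple statements like those above. It assumes well-formed input (subject and predicate URIs in angle brackets, the object a URI or quoted literal, a terminating " ."); it does not handle comments, escapes or blank nodes, for which a real harvester would use an RDF library such as rdflib.

```python
# Sketch: split one well-formed N-Triples statement into (subject, predicate, object).
def parse_ntriple(line):
    line = line.strip()
    assert line.endswith("."), "N-Triples statements end with ' .'"
    line = line[:-1].strip()
    parts = []
    while line and len(parts) < 2:       # subject and predicate are <URI>s
        assert line.startswith("<")
        uri, line = line[1:].split(">", 1)
        parts.append(uri)
        line = line.strip()
    parts.append(line)                   # object: URI or literal, kept as-is
    return tuple(parts)
```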
Semantic Web: the term was coined by Tim Berners-Lee for a web of data that can be processed by machines. Ever since its conception, the internet has been a network of computers comprising primarily human-understandable data, structures and relations, but with the multitudes of data in recent times we have moved into an era of bot users and crawlers. Search engines help present the desired data from the vast storehouses of big data available online, and this task is aided by the concept of the Semantic Web, which aims to create a web of data understandable not only by humans but also by machines. This way, crawlers, bots and search engines will become much faster, more efficient and more useful, reducing wasted human time and effort. This has been a growing area in computer science, borrowing ideas from machine learning, data mining and, most importantly, Artificial Intelligence (AI).
Turtle triple: Turtle[3] was defined by Dave Beckett as a subset of Tim Berners-Lee and Dan Connolly's Notation3 (N3) language, and a superset of the minimal N-Triples format.