© QUT 2016
To be a big data ninja you need to know many different technologies and
how they interoperate to create platforms that target your specific use
cases. Platforms are built from libraries and frameworks. Libraries
usually provide solutions to specific problems; for instance, applying
neural-network methods to your data. Frameworks integrate various
libraries to provide broader functionality. Here are a few
examples:
Apache Spark: a fast and general compute engine for Hadoop data.
Spark provides a simple and expressive programming model that
supports a wide range of applications, including Extract, Transform and
Load (ETL), machine learning, stream processing, and graph
computation.
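
To make the Extract, Transform and Load idea concrete, here is a minimal sketch in plain Python. It is not Spark code and does not use any Spark API; it only mirrors the map, filter and reduce-by-key style of pipeline that Spark's programming model expresses over distributed data. The input records, the function names (`extract`, `transform`, `load`, `run_etl`) and the tab-separated output format are all hypothetical, chosen for illustration.

```python
from collections import defaultdict

# Hypothetical raw input: "city,temperature" records, one of them malformed.
RAW_LINES = [
    "Brisbane,25.0",
    "Sydney,22.5",
    "Brisbane,27.0",
    "bad-record",
    "Sydney,21.5",
]


def extract(lines):
    """Extract: split each line into fields and drop malformed records."""
    rows = (line.split(",") for line in lines)
    return [row for row in rows if len(row) == 2]


def transform(rows):
    """Transform: map rows to (city, value) pairs, then average per city."""
    pairs = map(lambda row: (row[0], float(row[1])), rows)
    totals = defaultdict(lambda: (0.0, 0))  # city -> (running sum, count)
    for city, value in pairs:
        total, count = totals[city]
        totals[city] = (total + value, count + 1)
    return {city: total / count for city, (total, count) in totals.items()}


def load(averages):
    """Load: format results for a downstream store (here, sorted text lines)."""
    return [f"{city}\t{avg:.2f}" for city, avg in sorted(averages.items())]


def run_etl(lines):
    """Chain the three stages into one pipeline."""
    return load(transform(extract(lines)))


if __name__ == "__main__":
    for line in run_etl(RAW_LINES):
        print(line)
```

On a single machine this runs over an in-memory list. A framework such as Spark applies the same kind of step-by-step pipeline to data partitioned across a cluster, which is what makes the pattern useful for very large datasets.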
SEE ALSO
APACHE HADOOP
Apache Hadoop is an open-source framework that enables the distributed storage and processing
of very large datasets.
APACHE MAHOUT
APACHE PIG
Apache Pig is a platform for analysing large datasets; its scripts are compiled into MapReduce programs that run on Hadoop.
APACHE SPARK
CLOUDERA
See what industry is doing with Big Data frameworks and the Hadoop ecosystem.
A brief overview of Hadoop HDFS and MapReduce for beginners, with a pointer to the ‘Dummies’ series
on Big Data.