18 Free Exploratory Data Analysis Tools For People Who Don’t Code So Well

ANALYTICS VIDHYA CONTENT TEAM, SEPTEMBER 23, 2016

Introduction

Some of these tools are even better than programming (R, Python, SAS) tools.

All of us are born with special talents. It’s just a matter of time until we discover them and start believing in ourselves. We all
have limitations, but should we stop there? No.

When I started coding in R, I struggled, sometimes far more than you might imagine, because I had never coded
even <Hello World> in my entire life. My situation was like that of someone who couldn’t swim but was thrown
into the deep ocean, and who somehow avoided drowning but ended up gulping a lot of salty water.

Now when I look back, I laugh at myself. Do you know why? Because I could have chosen one of several non-coding
tools available for data analysis and avoided the suffering.

Data exploration is an unavoidable part of predictive modeling. You can’t make predictions unless you know what happened
in the past. The most important trait for mastering data exploration is curiosity, which is free of cost yet isn’t owned by
everyone.

I have written this article to introduce you to various free tools available for exploratory data analysis. Nowadays,
plenty of tools are available in the market which are free and quite interesting to work with. These tools don’t require you
to code explicitly; simple drag-and-drop clicks do the job.

List of Non Programming Tools


1. Excel / Spreadsheet
Whether you are transitioning into data science or have already spent years in it, you will know that
Excel remains an indispensable part of the analytics industry. Even today, most of the
problems faced in analytics projects are solved using this software. With larger-than-ever community
support, tutorials and free resources, learning this tool has become quite easy.

It supports all the important features, such as summarizing, visualizing and wrangling data, which are powerful
enough to inspect data from all possible angles. No matter how many tools you know, Excel must feature in your armory.
Microsoft Excel is a paid product, but you can still try other spreadsheet tools such as OpenOffice and Google Sheets, which
are certainly worth a try!
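
For readers curious about what a spreadsheet pivot table does behind the clicks, here is a minimal Python sketch of the same "group and summarize" idea. The rows, the column names (region, sales) and the helper pivot_summary are made up for illustration; nothing here comes from Excel itself.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical sales rows, as they might appear in a spreadsheet.
rows = [
    {"region": "North", "sales": 120.0},
    {"region": "South", "sales": 80.0},
    {"region": "North", "sales": 100.0},
    {"region": "South", "sales": 60.0},
]

def pivot_summary(rows, group_key, value_key):
    """Group rows by one column and summarize another, like a simple pivot table."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[group_key]].append(row[value_key])
    return {
        name: {"count": len(values), "sum": sum(values), "mean": mean(values)}
        for name, values in groups.items()
    }

summary = pivot_summary(rows, "region", "sales")
for region, stats in sorted(summary.items()):
    print(region, stats)
```

A spreadsheet gives you the same table by dragging "region" into the rows area and "sales" into the values area, which is why it remains such a quick way to inspect data.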

Free Download:

2. Trifacta
Trifacta’s Wrangler tool is challenging the traditional methods of data cleaning and manipulation.
Whereas Excel has limits on data size, this tool has no such boundaries and you can securely
work on big data sets. It has incredible features such as chart recommendations, inbuilt
algorithms and analysis insights, with which you can generate reports in no time. It’s an intelligent tool
focused on solving business problems faster, thereby allowing us to be more productive at
data-related exercises.

The availability of such free tools makes us feel more confident and supported, knowing that there are good people
around the world working extremely hard to make our lives better.

Free Download:

3. Rapid Miner
This tool emerged as a leader in the 2016 Gartner Magic Quadrant for Advanced Analytics. Yes, it’s more
than a data cleaning tool: it also builds machine learning models and includes
all the ML algorithms we use frequently. It is not just a GUI; it also supports people using
Python & R for model building.

It continues to fascinate people around the world with its remarkable capabilities. Above all, it claims to provide
a lightning-fast analytics experience. Its product line has several products built for big data, visualization and model
deployment, some of which (enterprise) require a subscription fee. In short, we can say it’s a complete tool for any
business that needs to perform every task from data loading to model deployment.

Free Download:

4. Rattle GUI
If you tried using R but couldn’t get the hang of what’s going on, Rattle should be your first choice.
This GUI is built on R. Install it by typing install.packages("rattle"), then launch it with library(rattle)
followed by rattle() in R. Therefore, to use Rattle you must install R. It’s also more than just a data mining tool:
Rattle supports various ML algorithms such as Tree, SVM, Boosting, Neural Net, Survival, Linear models etc.

It’s widely used these days. According to CRAN, rattle is installed 10000 times every month. It provides
enough options to explore, transform and model data in just a few clicks. However, it has fewer options than SPSS for
statistical analysis, though SPSS is a paid tool.

Free Download:

5. Qlikview
Qlikview is one of the most popular tools in the business intelligence industry around the world. Deriving
business insights and presenting them in an awesome manner is what this tool does. With its state-of-the-art
visualization capabilities, you’d be amazed by the amount of control you get while working on data.
It has an inbuilt recommendation engine that suggests, from time to time, the best visualization
methods while you work on data sets.

However, it is not statistical software. Qlikview is incredible at exploring data, trends and insights, but it can’t prove anything
statistically. For that, you might want to look at other software.

Free Download:

6. Weka
An advantage of using Weka is that it is easy to learn. Although it is a machine learning tool, its interface is
intuitive enough for you to get the job done quickly. It provides options for data pre-processing,
classification, regression, clustering, association rules and visualization. Most of the steps you think
of while model building can be achieved using Weka. It’s built on Java.

It was originally designed for research purposes at the University of Waikato, but it was later adopted
by more and more people around the world. However, over time I haven’t seen a Weka community as enthusiastic as those of R
and Python. The tutorial listed below should help you more.

Free Tutorial:

7. KNIME
Similar to RapidMiner, KNIME offers an open source analytics platform for analyzing data,
which can later be deployed and scaled using other supporting KNIME products. This tool has
an abundance of features for data blending, visualization and advanced machine learning
algorithms. Yes, you can build models with this tool too. There hasn’t been enough talk about this tool, but
considering its state-of-the-art design, I think it will soon catch the much-needed limelight.

Moreover, quick training lessons are available on their website to get you started with this tool right now.

Free Download:

8. Orange
As cool as it sounds, this tool is designed to produce interactive data visualizations and perform data
mining tasks. There are enough YouTube tutorials to learn this tool. It has an extensive library of data
mining tasks which includes classification, regression and clustering methods. In addition, the
versatile visualizations formed during data analysis allow us to understand the data
more closely.

To build any model, you’ll be required to create a flowchart. This is interesting, as it helps us further understand the
exact procedure of data mining tasks.

Free Download:

9. Tableau Public
Tableau is data visualization software. We can say Tableau and Qlikview are the most
powerful sharks in the business intelligence ocean. The debate over which is superior is never-ending.
It’s fast visualization software which lets you explore data, down to every observation, using various
possible charts. Its intelligent algorithms figure out by themselves the type of data, the best method available etc.

If you want to understand data in real time, Tableau can get the job done. In a way, Tableau imparts a colorful life to data
and lets us share our work with others.

Free Download:

10. Datawrapper


It’s lightning-fast visualization software. The next time someone in your team gets assigned
BI work and has no clue what to do, this software is an option worth considering. Its
visualization bucket comprises line charts, bar charts, column charts, pie charts, stacked bar charts and maps. So it’s basic
software and can’t be compared with giants like Tableau and Qlikview. This tool is browser-based and doesn’t require
any software installation.

11. Data Science Studio (DSS)


It is a powerful tool designed to connect technology, business and data. It is available in two
segments: Coding & Non-Coding. It’s a complete package for any organization which aims to develop,
build, deploy and scale models on a network. DSS is also powerful enough to create smart data
applications to solve real-world problems. It includes features which facilitate team integration on
projects. The most interesting part is that you can reproduce your work in DSS, as every action in the
system is versioned through an integrated Git repository.

Free Download:

12. OpenRefine
It started as Google Refine, but it looks like Google dropped the project for reasons that are unclear.
However, the tool is still available, renamed OpenRefine. Among the generous list of open source
tools, OpenRefine specializes in messy data: cleaning, transforming and shaping it for predictive
modeling purposes. As an interesting fact, during model building an analyst spends 80% of the time on data cleaning. Not so
pleasant, but it’s a fact. Using OpenRefine, analysts can save that time and put it to productive use.
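
To give a feel for the kind of clean-up this involves, here is a small Python sketch of grouping near-duplicate spellings under a normalized key. It is loosely modeled on the key-collision "fingerprint" idea used for clustering messy values; it is a simplified illustration under my own assumptions, not OpenRefine’s actual implementation, and the function names and sample values are hypothetical.

```python
import re
import unicodedata
from collections import defaultdict

def fingerprint(value):
    """Build a normalized key so near-duplicate spellings collide."""
    # Strip accents by decomposing characters and dropping the non-ASCII marks.
    text = unicodedata.normalize("NFKD", value)
    text = text.encode("ascii", "ignore").decode("ascii")
    # Lowercase and trim, then drop anything that is not a letter, digit or whitespace.
    text = re.sub(r"[^a-z0-9\s]", "", text.lower().strip())
    # Sort the unique tokens so that word order does not matter.
    tokens = sorted(set(text.split()))
    return " ".join(tokens)

def cluster(values):
    """Return groups of two or more values that share the same fingerprint."""
    groups = defaultdict(list)
    for value in values:
        groups[fingerprint(value)].append(value)
    return [group for group in groups.values() if len(group) > 1]

# Hypothetical messy column of city names.
messy = ["New York", "new york ", "York, New", "Boston", "BOSTON.", "Chicago"]
clusters = cluster(messy)
for group in clusters:
    print(group)
```

Each printed group is a set of spellings that probably refer to the same thing; in a GUI tool you would review such groups and merge them with a click.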

Free Download:

13. Talend
Decision making these days is largely driven by data. Managers & professionals no longer
make gut-based decisions. They require a tool which can help them quickly. Talend can help
them explore data and support their decision making. Precisely, it’s a data collaboration
tool capable of cleaning, transforming and visualizing data.

Moreover, it also offers an interesting automation feature where you can save and redo your previous task on a new data
set. This feature is rare; few tools have it. It also performs auto-discovery and provides smart suggestions
to the user for enhanced data analysis.

Free Download:

14. Data Preparator


This tool is built on Java to assist us in data exploration, cleaning and analysis. It includes various inbuilt
packages for discretization, numeration, scaling, attribute selection, missing values, outliers, statistics,
visualization, balancing, sampling, row selection, and several other tasks. Its GUI is intuitive and simple to
understand. Once you start working with it, I’m sure you won’t take long to figure out how it works.
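
If terms like missing values, scaling and discretization sound abstract, the following Python sketch shows one simple version of each. These are generic textbook techniques (mean imputation, min-max scaling, equal-width binning) chosen for illustration; I am not claiming they are the exact methods this tool applies, and the function names and sample data are hypothetical.

```python
from statistics import mean

def impute_missing(values):
    """Replace None with the mean of the observed values."""
    observed = [v for v in values if v is not None]
    fill = mean(observed)
    return [fill if v is None else v for v in values]

def min_max_scale(values):
    """Rescale values linearly to the range [0, 1]."""
    low, high = min(values), max(values)
    if high == low:
        return [0.0 for _ in values]
    return [(v - low) / (high - low) for v in values]

def equal_width_bins(values, n_bins):
    """Assign each value a bin index from 0 to n_bins - 1 using equal-width intervals."""
    low, high = min(values), max(values)
    if high == low:
        return [0 for _ in values]
    width = (high - low) / n_bins
    # The maximum value would land in bin n_bins, so clamp it into the last bin.
    return [min(int((v - low) / width), n_bins - 1) for v in values]

# Hypothetical column of ages with one missing entry.
ages = [20, None, 30, 40, 60]
filled = impute_missing(ages)
scaled = min_max_scale(filled)
bins = equal_width_bins(filled, 4)
print(filled, scaled, bins)
```

In a GUI tool each of these steps is a menu option; the sketch only shows what such an option might compute.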

A unique advantage of this tool is that the data set used for analysis doesn’t get stored in computer memory. This means you
can work on large data sets without any speed or memory troubles.
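
The article does not say how the tool achieves this. One common way to analyze data without loading it all into memory is to stream it row by row and keep only running totals. The Python sketch below illustrates that general idea with a running mean; the column name, the sample data and the function streaming_mean are assumptions made for the example.

```python
import csv
import io

def streaming_mean(lines, column):
    """Compute the mean of one CSV column while holding only one row at a time."""
    reader = csv.DictReader(lines)
    count = 0
    running_mean = 0.0
    for row in reader:
        count += 1
        # Incremental update: move the mean toward the new value by 1/count of the gap.
        running_mean += (float(row[column]) - running_mean) / count
    return count, running_mean

# Stand-in for a large file on disk; a file object from open() would work the same way.
data = io.StringIO("id,amount\n1,10\n2,20\n3,60\n")
count, avg = streaming_mean(data, "amount")
print(count, avg)
```

Because only the current row and two numbers are kept, memory use stays flat no matter how many rows the file has.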

Free Download:

15. DataCracker
It’s data analysis software which specializes in survey data. Many companies run surveys, but
they struggle to analyze them statistically. Survey data are never clean; they contain a lot of
missing & inappropriate values. This tool reduces our agony and enhances our experience of working on messy data. It
is designed so that it can load data from all major internet survey programs like SurveyMonkey, SurveyGizmo etc.
There are several interactive features which help you understand data better.

Free Download:

16. Data Applied


This powerful interactive tool is designed to build, design and share data analysis reports. Creating
visualizations on large data sets can sometimes be troublesome, but this tool is robust at visualizing
large amounts of data using tree maps. Like all the other tools above, it has features for data transformation, statistical analysis,
anomaly detection etc. All in all, it’s a multi-purpose data mining tool capable of automatically extracting valuable
knowledge (signal) from raw data. You’d be amazed to see that such non-programming tools are no less capable than R or
Python for data analysis.

Free Download:

17. Tanagra Project


You might not like it because of its old-fashioned UI, but this free data mining software is
designed to build machine learning models. The Tanagra project started as free software for academic and research
purposes. Being an open source project, it gives you enough space to devise your own algorithms and contribute.

Along with supervised learning algorithms, it supports paradigms such as clustering, factorial analysis, parametric
and nonparametric statistics, association rules, feature selection and construction algorithms etc. Its limitations
include the lack of a wide set of data sources, direct access to data warehouses and databases, data cleansing,
interactive use etc.

Free Download:

18. H2o
H2o is one of the most popular software packages in the analytics industry today. In a few years, this organization has
succeeded in evangelizing the analytics community around the world. With this open source software,
they bring a lightning-fast analytics experience, which is further extended through APIs for programming languages. You can not only analyze
data but also build advanced machine learning models in no time. The community support is great, so learning
this tool isn’t a worry. If you live in the US, chances are they are organizing a meetup near you. Do drop by!

Free Download:

Bonus Additions:

In addition to the awesome tools above, I also found some more tools which I thought you might be interested in.
These tools aren’t free, but you can still try them on a trial basis:

1.
2.
3.
4.

End Notes
Once you start working with these tools (your choice), you’ll understand that knowing programming isn’t much of an advantage
for predictive modeling. You can accomplish the same thing with these free tools. Therefore, if until now you
were disappointed by your lack of coding skills, now is the time to channel your enthusiasm into these tools. You
may be interested to check .

The only limitation I see with some of these tools is the lack of community support. Apart from a few tools, several of them
don’t have a community to turn to for help and suggestions. Still, they’re worth a try!

Did you like reading this article? Have you worked on any of the tools listed above? Which one do you think is the most
versatile? Drop your suggestions / opinions in the comments below.

21 COMMENTS

Sabrina says:
SEPTEMBER 23, 2016 AT 5:58 AM

Manish, I enjoy your articles a lot, comprehensive list and it would make life so much easier for non coders? Great job!

Analytics Vidhya Content Team says:
SEPTEMBER 23, 2016 AT 8:23 AM

Good to know! Thanks.

Shantala says:
SEPTEMBER 23, 2016 AT 6:12 AM

Very Informative. I will try Rattle GUI in R. Thanks for the information.
Keep posting….

Analytics Vidhya Content Team says:
SEPTEMBER 23, 2016 AT 8:23 AM

Rattle is a good choice.


Good luck.

Sunil Kappal says:
SEPTEMBER 23, 2016 AT 6:34 AM

Great article, I believe BGML is another one to lookout for as it is picking up pretty good pace with analysts and data
scientists. The awesome thing about this tool is that it lets you download the algorithm as a code which can be used
directly for predictions.

Just thought of sharing it !!!

Hunaidkhan Pathan says:
SEPTEMBER 23, 2016 AT 12:57 PM

Can you provide the URL for BGML , I am not able to find it on internet

Sunil Kappal says:
SEPTEMBER 23, 2016 AT 3:27 PM

here you go !!!


https://bigml.com/

Hunaidkhan Pathan says:
SEPTEMBER 23, 2016 AT 4:14 PM

Thanks

Gianni says:
SEPTEMBER 23, 2016 AT 7:11 AM

Thank you Manish ! Good job.

Herman says:
SEPTEMBER 23, 2016 AT 8:21 AM

Brilliant article. Thank you.

Mehul Shah says:
SEPTEMBER 23, 2016 AT 9:30 AM

Thank you Manish for the nice and informative article for individuals like me who are not from coding background.

Pratima Joshi says:
SEPTEMBER 23, 2016 AT 10:56 AM

Hello Manish,
More than any of the above mentioned tools, I found Microsoft Azure ML studio very useful, user-friendly and easy to
learn.
It is free, cloud based and has support for R and Python.

Yvette says:
SEPTEMBER 23, 2016 AT 1:44 PM

Thank you for the article, Manish. I have Tableau Desktop, but I am always keeping my eye on what tools are available. I
enjoyed the article.

RRD says:
SEPTEMBER 23, 2016 AT 2:09 PM

The section on DataWrapper is missing a link. Thanks for this excellent article.

Analytics Vidhya Content Team says:
SEPTEMBER 24, 2016 AT 5:34 AM

Hi RRD, Datawrapper is a browser enabled tool. Hence, nothing to download.

Anshul solanki says:
SEPTEMBER 23, 2016 AT 3:55 PM

I was wondering if we have similar kind of tools to do exploration of data in text format, specifically for NLP related
problems.

Omesaad says:
SEPTEMBER 24, 2016 AT 9:31 PM

Thanks Manish
I wonder what about Nvivo and spss ? Are they involved?

Hari Galla says:
SEPTEMBER 28, 2016 AT 7:22 AM

Thank you Manish! Lot of information about tools available

Ramdas says:
SEPTEMBER 30, 2016 AT 9:28 PM

Thank you manish, very informative article as always, will check out the trifacta tool.

Ramdas

Edmund Laugasson says:
OCTOBER 4, 2016 AT 6:12 AM

Apache OpenOffice is not developed actively but LibreOffice is a good reincarnation and contains already several data
analysis tools – https://help.libreoffice.org/Calc/Data_Statistics_in_Calc

joshua bryant says:
DECEMBER 16, 2016 AT 11:40 PM

hi
i think this is a great website
