
What is the Relational Model

The relational model represents the database as a collection of relations. A relation is nothing but a table
of values. Every row in the table represents a collection of related data values. These rows in the table
denote a real-world entity or relationship.
The table name and column names are helpful to interpret the meaning of values in each row. The data
are represented as a set of relations. In the relational model, data are stored as tables. However, the
physical storage of the data is independent of the way the data are logically organized.
Some popular Relational Database management systems are:
- DB2 and Informix Dynamic Server – IBM
- Oracle and RDB – Oracle
- SQL Server and Access – Microsoft

Relational Model Concepts


1. Attribute: Each column in a table. Attributes are the properties that define a relation, e.g., Student_Rollno, NAME, etc.
2. Tables: In the relational model, relations are saved in table format, along with their entities. A table has two properties: rows and columns. Rows represent records and columns represent attributes.
3. Tuple: A single row of a table, which contains a single record.
4. Relation Schema: A relation schema represents the name of the relation together with its attributes.
5. Degree: The total number of attributes in the relation is called the degree of the relation.
6. Cardinality: The total number of rows present in the table.
7. Column: A column represents the set of values for a specific attribute.
8. Relation instance: A finite set of tuples in the RDBMS. Relation instances never have duplicate tuples.
9. Relation key: One or more attributes that can uniquely identify a row; these are called the relation key.
10. Attribute domain: Every attribute has a pre-defined set of allowed values and scope, known as the attribute domain.
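As an illustrative sketch (not part of the original text), several of the concepts above can be seen in a small SQLite session; the Student table and its attributes are hypothetical:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Relation schema: Student(Student_Rollno, Name) -- two attributes,
# so the degree of this relation is 2.
cur.execute("CREATE TABLE Student (Student_Rollno INTEGER, Name TEXT)")

# Each inserted row is one tuple of the relation.
cur.executemany("INSERT INTO Student VALUES (?, ?)",
                [(1, "Alice"), (2, "Bob"), (3, "Carol")])

# Degree: number of attributes; cardinality: number of rows.
degree = len(cur.execute("SELECT * FROM Student").description)
cardinality = cur.execute("SELECT COUNT(*) FROM Student").fetchone()[0]
print(degree, cardinality)  # 2 3
```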

Relational Integrity constraints


Relational integrity constraints are conditions that must hold for a valid relation. These
integrity constraints are derived from the rules of the mini-world that the database represents.
There are many types of integrity constraints. Constraints on a relational database management system
are mostly divided into three main categories:
1. Domain constraints
2. Key constraints
3. Referential integrity constraints
Domain Constraints
Domain constraints are violated if an attribute value does not appear in the corresponding domain or
is not of the appropriate data type.
Domain constraints specify that, within each tuple, the value of each attribute must be an atomic value
from its domain. Domains are specified as data types, including the standard types: integers, real
numbers, characters, Booleans, variable-length strings, etc.
Example:
CREATE DOMAIN CustomerName AS VARCHAR(40)
CHECK (VALUE IS NOT NULL);
The example shown creates a domain constraint such that CustomerName is never NULL (the VARCHAR length
here is illustrative).
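Not every engine supports standard SQL's CREATE DOMAIN. As a hedged sketch, SQLite (an illustrative choice) can approximate a domain constraint with NOT NULL and a column-level CHECK clause; the table and column names are hypothetical:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE Customer (
        CustomerName TEXT NOT NULL CHECK (length(CustomerName) > 0)
    )
""")
conn.execute("INSERT INTO Customer VALUES ('Google')")  # value lies in the domain

violated = False
try:
    conn.execute("INSERT INTO Customer VALUES (NULL)")  # outside the domain
except sqlite3.IntegrityError:
    violated = True  # the domain-style constraint rejected the NULL value
print(violated)  # True
```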
Key constraints
An attribute that can uniquely identify a tuple in a relation is called the key of the table. The value of the
attribute for different tuples in the relation has to be unique.
Example:
In the given table, CustomerID is a key attribute of the Customer table, and each customer has a single
key value: CustomerID = 1 belongs only to CustomerName = "Google".

CustomerID   CustomerName   Status
1            Google         Active
2            Amazon         Active
3            Apple          Inactive
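A minimal sketch of a key constraint, using SQLite's PRIMARY KEY as the relation key (an illustrative stand-in; the data mirrors the Customer table above, and the duplicate row is invented to show the violation):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE Customer (
        CustomerID   INTEGER PRIMARY KEY,  -- the key attribute
        CustomerName TEXT,
        Status       TEXT
    )
""")
conn.executemany("INSERT INTO Customer VALUES (?, ?, ?)", [
    (1, "Google", "Active"),
    (2, "Amazon", "Active"),
    (3, "Apple",  "Inactive"),
])

# A second tuple with CustomerID = 1 violates the key constraint.
duplicate_rejected = False
try:
    conn.execute("INSERT INTO Customer VALUES (1, 'Facebook', 'Active')")
except sqlite3.IntegrityError:
    duplicate_rejected = True
print(duplicate_rejected)  # True
```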

Referential integrity constraints


Referential integrity constraints are based on the concept of foreign keys. A foreign key is an
attribute of a relation that refers to a key attribute in another (or the same) relation. A referential
integrity constraint states that whenever a relation refers to a key attribute of a different or the
same relation, that key value must exist in the referenced table.
Example:

In the above example, we have two relations, Customer and Billing.

The tuple for CustomerID = 1 is referenced twice in the Billing relation, so we know that
CustomerName = "Google" has a billing amount of $300.
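A hedged sketch of the same setup in SQLite (which needs foreign-key enforcement switched on explicitly); the individual billing amounts are assumed values chosen to sum to the $300 mentioned above:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite requires this explicitly
conn.execute("CREATE TABLE Customer (CustomerID INTEGER PRIMARY KEY, CustomerName TEXT)")
conn.execute("""
    CREATE TABLE Billing (
        BillID     INTEGER PRIMARY KEY,
        CustomerID INTEGER REFERENCES Customer(CustomerID),  -- foreign key
        Amount     INTEGER
    )
""")
conn.execute("INSERT INTO Customer VALUES (1, 'Google')")
# Two Billing tuples reference CustomerID = 1.
conn.execute("INSERT INTO Billing VALUES (1, 1, 100)")
conn.execute("INSERT INTO Billing VALUES (2, 1, 200)")

# Referencing a CustomerID that does not exist violates the constraint.
orphan_rejected = False
try:
    conn.execute("INSERT INTO Billing VALUES (3, 99, 50)")
except sqlite3.IntegrityError:
    orphan_rejected = True

total = conn.execute(
    "SELECT SUM(Amount) FROM Billing WHERE CustomerID = 1").fetchone()[0]
print(orphan_rejected, total)  # True 300
```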
Operations in Relational Model
Four basic update operations performed on the relational database model are
insert, update, delete, and select.
- Insert is used to insert data into the relation.
- Delete is used to delete tuples from the table.
- Modify (update) allows you to change the values of some attributes in existing tuples.
- Select allows you to choose a specific range of data.
Whenever one of these operations is applied, the integrity constraints specified on the relational
database schema must never be violated.
Insert Operation
The insert operation gives values of the attribute for a new tuple which should be inserted into a relation.

Update Operation
You can see in the relation table below that CustomerName = 'Apple' is updated from Inactive to
Active.

Delete Operation
To specify deletion, a condition on the attributes of the relation selects the tuple to be deleted.

In the above-given example, CustomerName= "Apple" is deleted from the table.


The Delete operation could violate referential integrity if the tuple which is deleted is referenced by
foreign keys from other tuples in the same database.
Select Operation

In the above-given example, CustomerName="Amazon" is selected
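The four operations above can be sketched in one short SQLite session (an illustrative stand-in for the figures referenced in this section, using the same Customer data):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Customer "
             "(CustomerID INTEGER PRIMARY KEY, CustomerName TEXT, Status TEXT)")

# Insert: add new tuples to the relation.
conn.executemany("INSERT INTO Customer VALUES (?, ?, ?)", [
    (1, "Google", "Active"), (2, "Amazon", "Active"), (3, "Apple", "Inactive"),
])

# Update (modify): change attribute values in an existing tuple.
conn.execute("UPDATE Customer SET Status = 'Active' WHERE CustomerName = 'Apple'")

# Delete: a condition on the attributes selects the tuple to remove.
conn.execute("DELETE FROM Customer WHERE CustomerName = 'Apple'")

# Select: choose a specific subset of the data.
rows = conn.execute(
    "SELECT * FROM Customer WHERE CustomerName = 'Amazon'").fetchall()
print(rows)  # [(2, 'Amazon', 'Active')]
```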


Best Practices for creating a Relational Model
- Data need to be represented as a collection of relations
- Each relation should be depicted clearly in the table
- Rows should contain data about instances of an entity
- Columns must contain data about attributes of the entity
- Cells of the table should hold a single value
- Each column should be given a unique name
- No two rows can be identical
- The values of an attribute should be from the same domain
Advantages of using the Relational model
- Simplicity: A relational data model is simpler than the hierarchical and network models.
- Structural independence: The relational database is concerned only with data, not with structure,
which can improve the performance of the model.
- Easy to use: The relational model is easy to use, as tables consisting of rows and columns are quite
natural and simple to understand.
- Query capability: It makes it possible for a high-level query language like SQL to avoid complex
database navigation.
- Data independence: The structure of a database can be changed without having to change any
application.
- Scalable: Regarding the number of records (rows) and the number of fields, a database can be
enlarged to enhance its usability.
Disadvantages of using the Relational model
- Some relational databases have limits on field lengths that cannot be exceeded.
- Relational databases can become complex as the amount of data grows and the relations between
pieces of data become more complicated.
- Complex relational database systems may lead to isolated databases where information cannot be
shared from one system to another.
Summary
- The relational database model represents the database as a collection of relations (tables).
- Attribute, table, tuple, relation schema, degree, cardinality, column, and relation instance are
some important components of the relational model.
- Relational integrity constraints are conditions that must hold for a valid relation.
- Domain constraints are violated if an attribute value does not appear in the corresponding domain
or is not of the appropriate data type.
- Insert, select, modify, and delete are operations performed in the relational model.
- The relational database is concerned only with data, not with structure, which can improve the
performance of the model.
- Advantages of the relational model are simplicity, structural independence, ease of use, query
capability, data independence, and scalability.
- Some relational databases have limits on field lengths that cannot be exceeded.
- Data wrangling, sometimes referred to as data munging, is the process of transforming and
mapping data from one "raw" data form into another format with the intent of making it more
appropriate and valuable for a variety of downstream purposes such as analytics. A data wrangler
is a person who performs these transformation operations.
- This may include further munging, data visualization, data aggregation, training a statistical
model, and many other potential uses. Data munging typically follows a set of general steps:
extracting the data in raw form from the data source, "munging" the raw data using algorithms
(e.g. sorting) or parsing it into predefined data structures, and finally depositing the resulting
content into a data sink for storage and future use.
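The extract/munge/deposit steps just described can be sketched as a toy pipeline; the raw strings and field layout below are invented purely for illustration:

```python
# Extract: raw records pulled from some hypothetical source.
raw = ["bob,34", "alice,29", "carol,41"]

# Munge: parse each record into a predefined structure, then sort.
parsed = [dict(zip(("name", "age"), r.split(","))) for r in raw]
for rec in parsed:
    rec["age"] = int(rec["age"])
parsed.sort(key=lambda rec: rec["age"])

# Deposit: store the result in a "data sink" for future use.
data_sink = {rec["name"]: rec["age"] for rec in parsed}
print(data_sink)
```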
Feature integration theory is a theory in psychology that describes how a person pieces
together separate features of an object to create a more complete perception of that object. The
theory focuses especially on the sense of sight and how the eyes absorb information to
"experience" the object being seen. Aside from perception, feature integration theory also discusses
the importance of attention in forming a correct view of the observed object.
The development of feature integration theory is largely credited to Anne Treisman and Garry Gelade,
who co-wrote the academic paper "A Feature-Integration Theory of Attention" in 1980. In
the paper, Treisman and Gelade cited several past experiments that revolve around “visual search,” or
the process in which the individual, for example, distinguishes the object’s color and shape apart from
other objects. Some experiments, on the other hand, dealt with “texture segregation” to distinguish the
object from its background, while other experiments explored the person’s ability to spatially locate the
object. In this way, the theory of feature integration suggests that the attributes of a certain object are
processed in sequence, especially in situations where the person needs to notice several features to
correctly distinguish the object. For example, if a person is looking in a crowd for a male friend who has
shoulder-length hair, the first step is to look for people who have shoulder-length hair, and progress into
the friend’s distinguishing characteristics that will single him out.
In general, the feature integration theory describes two primary stages of attention: the pre-attentive and
the focused attention stages. In the first stage of pre-attention, the person instinctively and automatically
focuses on one distinguishing feature of an object, such as its color and orientation. The person does not
really need to make a conscious effort to think in this stage. For example, a person can easily detect a
slanted line among horizontal lines on a piece of paper. In the stage of focused attention, the person takes
all the features of the object and combines all of them to give a correct perception of the object. This is
especially done in situations where the object does not instantly stand out among other objects, such as
a red circle among other circles and squares randomly colored red and blue.
Training and practice that apply feature integration theory can help a person improve their skills in
abstract reasoning and attention. They can also help them be more aware and careful of their surroundings.
Teachers can also apply the theory to help students remember their lessons by using differently colored
chalk or board markers for important key words.

Examples of a just noticeable difference, or JND, include the detection of change in the volume of
ambient sound, the luminosity of a light in a room, or the weight of a handheld object. The difference
threshold is demonstrated at the moment a change in the nature of such stimuli is detected.
For example, an individual is not likely to notice a slight, gradual increase in the volume of music if the
change in volume remains below the threshold of detection. At a certain point, however, the individual
notices that the volume of the music has increased. The volume at which the increase was noticed
demonstrates the concept of just-noticeable difference. In other words, the threshold of detection has
been surpassed, and the individual is now able to perceive that a change in volume has occurred. Similarly,
a gradual change in the brightness of a light in a room or in the weight of two similar objects held in each
hand remains unnoticed until the change surpasses the threshold at which a difference in luminosity or
weight is perceived.
Just noticeable difference, formalized in Weber's law, is a concept in psychology based on the
findings of Ernst Heinrich Weber, a forerunner in the field of experimental psychology and perception.
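The threshold idea above is often summarized by Weber's law: a change in intensity becomes just noticeable once the ratio of the change to the base intensity reaches a constant Weber fraction k. A minimal sketch follows; the k = 0.05 value and the volume units are assumed illustrative constants, not figures from the text:

```python
def is_noticeable(intensity, delta, k=0.05):
    """Return True once delta / intensity reaches the Weber fraction k."""
    return delta / intensity >= k

volume = 60.0                      # current volume, arbitrary units
print(is_noticeable(volume, 1.0))  # 1/60 is below the threshold: False
print(is_noticeable(volume, 4.0))  # 4/60 exceeds the threshold: True
```

Note that the same function models luminosity or weight: only the stimulus and its Weber fraction change.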

APPLYING GESTALT THEORY TO DATA VISUALIZATION

Gestalt theory is made up of several principles (including the concepts of proximity, similarity,
closure, continuation, and figure/ground) that describe how the human brain sees visual information.
Designers who understand this theory can develop visuals that communicate information in the most
effective ways. We discuss each of these principles in turn.

- Proximity: When items are placed in close proximity, people assume that they're in the same group
because they're close to one another and apart from other groups. The following figure shows a
visualization that includes grouped items.
- Similarity: When items look the same, people perceive them to be of the same type. We naturally
assume that shapes that look the same are related. When you create a data viz and you keep items
together that look the same, you make it easy for someone to understand that those items represent
a group.
- Closure: Our eyes tend to add any missing pieces of a familiar shape. If two sections are taken out
of a circle, as shown in the following figure, people still perceive the whole circle. Even with
pieces missing, you can still tell that the image is a circle.
- Continuation: If people perceive objects as moving in a certain direction, they see them as
continuing to move that way. The figure below shows an example: items stay in the same path of
movement.
- Figure/ground: Depending on how people look at a picture, they see either the figure (foreground)
or the ground (background) as standing out, as shown in the following figure. Both the figure and
the ground form shapes.


Introduction
At a recent talk I challenged the audience to define several gestalt principles based solely on
representative figures. This "academic" approach to data visualization seems in opposition to a
"pragmatic" approach that focuses on best practices and prior art demonstrated in the growing library of
data visualization books and 2-day seminars.
But let me suggest that gestalt is very much a pragmatic aspect of creating data visualization, in fact a
necessary aspect if you plan to do more than simple bar and line charts (and perhaps even for those simple
charts). This exploration of three of the most simple gestalt principles focuses on how they operate and
how they might act in tandem with and in opposition to each other. I also include some gestures toward
how the gestalt may already be influencing what we think of not as cognitive qualities but as design and
style in data visualization.
Similarity
The most intuitive gestalt principle is that graphical elements with shared visual properties will be
considered in the same group. Here we see the use of color similarity to indicate two classes of elements:
the red ones and the gray ones. This could have also utilized shared symbols (for instance leveraging
d3.svg.symbol or the like) to show shared category; or shared stroke color or width; icons and so on.
Hue and saturation are very bad at denoting quantitative values, but very good at denoting categories.
This basic example seems uncontroversial to the point that it might seem too facile. But while gestalt
principles themselves are important to crafting effective data visualization, I think the gestalt gaze is
equally important. Once we formalize how we are using graphical features to indicate category, quantity,
or topology--even the most fundamental like color similarity--we also notice features that unintentionally
convey meaning.
Some of these unintentional graphical signals are already present in this simple figure: the implied
columns and rows seem to indicate 8 or 5 other groups; the color red, because of its hue, implies
activation, while the subdued gray implies deactivation; and the memory of all circles being initially
gray, with only half transitioning to red, reinforces this activation signal.
Proximity
A graphical element being close to another graphical element is a strong indication of similarity. The circles
on the right have been split into two groups by simply making the 10 circles on the left closer to each
other than the 30 circles on the right.
We don't typically think that bars in a bar chart are similar simply because they are next to each other,
nor do we assume slices in a pie chart are similar to each other because they are neighbors, but that's
actually what's being conveyed. Clean chart design that groups bars into categories or sorts them by
descending or ascending values works because it aligns the chart to accord with what the reader visually
expects (that things near each other are more similar to each other). In the case of ordering by value, bars
are nearest to the bars that they have similar values with, while categorical ordering groups bars based
on attribute similarity not conveyed in the length of the bar.
One major challenge of deploying more complex data visualization methods, such as force-directed
networks, sankey diagrams, or circle packing, is that with such charts proximity often does not mean
similarity. Instead, similarity is graphically denoted with a container or a visible line connecting one
element to another. This spatial problem is difficult to solve, especially with complex datasets, and must
be planned for in deploying any data visualization.
Enclosure
The use of enclosure--surrounding a group of related elements with a visual element--is not a common
technique in data visualization. This is remarkable given how powerful enclosure is. Here we see enclosure
alongside similarity and proximity and yet providing the strongest visual signal of the three.
Enclosure is less common in procedural data visualization because it's hard to compute a clean, effective
border around a group of elements that are being arranged by an algorithm. There are useful techniques,
such as d3.geom.hull for computing a convex hull around a set of points on a plane, but it can be hard to
deploy. Constraint-based graph drawing, like that found in cola.js, accounts for groups in its algorithm,
which allows for more effective use of enclosure in network visualization.
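As a sketch of why computing an enclosure border is feasible but fiddly, here is Andrew's monotone-chain convex hull in plain Python, a stand-in for what d3.geom.hull computes; the point coordinates are a hypothetical group layout:

```python
def convex_hull(points):
    """Return the hull vertices of a 2D point set, counter-clockwise."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts

    def cross(o, a, b):
        # Cross product of vectors o->a and o->b; <= 0 means a clockwise turn.
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

    lower, upper = [], []
    for p in pts:                      # build the lower boundary
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):            # build the upper boundary
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]     # endpoints shared, so drop duplicates

group = [(0, 0), (4, 0), (4, 3), (0, 3), (2, 1), (1, 2)]  # hypothetical layout
print(convex_hull(group))  # interior points are dropped from the border
```

As the text notes, a raw hull like this is only the start: padding, smoothing, and collision with other groups still have to be handled before it reads as a clean enclosure.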
Revelation
This isn't, as far as I'm aware, an actual gestalt principle, but note that the order of graphical transition in
this figure is also a signal. There are implications of causation as well as currency in animated data
visualization which, if unaccounted for, can damage communication. Dynamic data visualization is
powerful not simply because it moves and entertains the short attention spans among us, but because it
communicates prior positions, colors, and relationships. The memory and order of it need to be
thoughtfully deployed.
Conclusion
Accounting for the unintentional values being encoded in the basic settings of our data visualization
graphics is critical. When a reader sees shapes near each other, or a more saturated color, or an animated
transition, and that signal is simply an unintended byproduct of a palette or layout, then that's a failure
on the part of the data visualization creator.
Likewise, limiting our use of graphical signals to the most basic like color similarity reduces our ability to
communicate effectively. In cases where several different categorical distinctions are at play, it limits our
ability to communicate in a sophisticated manner. Effective design and implementation of more complex
data visualization that relies on enclosure (like treemaps and circle packing) and other gestalt principles
not covered in this short essay can only happen if you are aware of the signals those graphics are sending.

Here are several ideas and concepts of interaction design for data visualizations and
interactive websites, using 11 examples from the web.
1. The Basics: Highlighting and Details on Demand
Highlighting and details on demand are interactions that are useful for almost all data visualizations.
Highlighting
The Evolution of the Web is a great example of how highlighting can support the user in focusing on certain
parts of the visualization. The colored, curved web feature lines and the browser lines are emphasized on
mouseover, allowing the user to see when a feature was adopted by the different browsers.

by Hyperakt.

Details on Demand
The UEFA EURO 2012 Tournament Map interactive shows a rich popup window when the user clicks on
games, groups, teams or stadiums. Each popup contains detailed information such as an article headline
and quotes for a game. It provides links for further in-depth exploration and gives the user the opportunity
to post to social media.

2. Making More Data Accessible: User-driven Content Selection


A major advantage of interactive visualizations is that the content can be changed by the user. The main
part of such a configurable visualization becomes the template through which different structurally similar
data sets are displayed, and additional controls allow the user to change what data gets displayed. When
used in such a manner, an interactive visualization can make a much larger data set accessible than a
comparable static graphic.
Incremental content changes
In the How Many Households Are Like Yours? interactive by the New York Times, the user can select the
household type by choosing a primary resident and adding or removing additional residents. For each
change, the information shown in the visualization is updated, giving immediate feedback about the
change. This interaction makes data about many different household types accessible without being
overwhelming. A nice detail is the graphical representation of the different residents.
Drill Down
The different expenditure categories in the Where Does My Money Go? interactive are shown as bubbles
in a hierarchical, circular drill-down navigation element. When the user selects a child, sibling, or parent
expenditure category bubble, the navigation element zooms to the selected bubble. The bubble sizes
represent the expenditure amounts.


3. Showing Data in Different Ways: Multiple Coordinated Visualizations


A single graphical representation typically only shows a few dimensions at once and in a particular way.
For example, maps emphasize geographic location and timelines the flow of time. Those commonly used
representations also often have well-known interactions such as pan and zoom for maps. By assembling
multiple standard parts and coordinating them, you can show different aspects of the data set at the same
time.
Content Filtering with Control and Dependent Visualizations
The How Riot Rumours Spread on Twitter visualization by the Guardian uses a line chart as a control for
the content of a bubble chart. The line chart shows the rumour-related tweets per hour over time and
highlights important events. The bubble chart shows the different tweets, their relation to the rumour
and their influence. The user can play the visualization as an animation and directly interact with it. The
interplay between line and bubble chart enables the user to explore interesting points in time, e.g. to see
the proportion between supportive and opposing tweets when the number of rumour-related tweets
started to decline.


In CNN’s Home and Away interactive, the user can filter the casualties that are displayed on the home and
away maps by age, location and date, using multiple control bar charts at the bottom of the visualization.
The intersection of the filter settings is applied. Similar to the Riots visualization, this one immediately
updates while the user is dragging the sliders, enabling rapid content exploration (dynamic queries).
Selecting a location also updates the area shown in the home map.


Highlighting and Selection Across Multiple Visualizations


In my visualization of deadly earthquakes, I used synchronized selection and highlighting across a map, a
timeline and a bar chart. Users can explore how earthquakes are distributed geographically and over time,
by their magnitude and by the number of casualties. By selecting one set of earthquakes and highlighting
another one in the bar chart, users can also compare spatial and temporal distribution of these two sets.


4. Showing Data in Different Ways: User-driven Visual Mapping Changes


Multiple coordinated visualizations show multiple perspectives on the data at the same time – leaving less
screen real estate available for each visualization. Allowing the user to reconfigure the mappings from
data to visual form (visual mappings) for a fixed visualization type is an alternative that can help in
maximizing the visualization size.
Choosing Pre-Defined Visual Mappings
The Flooding, Power Failures, Rainfall and Damage from Hurricane Irene interactive by the New York Times
provides four different map settings that the user can choose from. When the user selects a different
setting, the data and the way it is projected onto the visualization layer of the map changes. The map,
including the path of the hurricane, and the story next to it provide a constant frame of reference.

User-Driven Visual Mapping Changes for Visual Properties


In the visualization of crowd documentation that I created as part of a research project, the user can select
the data that is displayed for area and for color from a fixed set of data attributes and operations. The
color scale can also be adjusted in a similar way. With this approach, the user can try out more
configurations than with the pre-defined mappings approach, at the expense of less directed
storytelling and some potentially useless or confusing setting combinations.

5. Integrating Users’ Viewpoints and Opinions


Engaging users by allowing them to enter their own viewpoints and opinions into a visualization is the
aspect of interaction that excites me the most. By merging the data display with the user’s personal
viewpoints, the visualization becomes a more central part of the user’s reasoning process on the displayed
topic.
User Viewpoints: Weighting Metrics
The OECD Better Life Index visualization lets users rate the importance of different topics, such as housing,
life satisfaction or education, to create personalized rankings of OECD countries. The measures for each
topic are weighted by the user’s importance rating. The personal importance ratings can be compared to
those of different demographic groups, and the personalized ranking can be shared on social media.

Integrating User’s Projections


The 2012 Election Dashboard by the Huffington Post has a section where the user can project the outcome
for the different states. By clicking on each state multiple times, the user can change his forecast between
Obama, Romney, and undecided.
