
Object Database System

By Sudarshan
MCA Sem V

Objects and OIDs

• Data objects can be given an object identifier (OID),
which is a value that is unique in the database
across time.
• An OID is a persistent handle or name for a
particular object.
• The DBMS is responsible for generating OIDs and
ensuring that an OID identifies an object uniquely
over its entire lifetime.
• Generally, OIDs are 32- or 64-bit integers that are
managed by the DBMS.
• Some systems treat every tuple stored in a table as
an object and automatically assign it a unique OID.
• Other systems let the user specify the tables whose
tuples are to be assigned OIDs.
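
The uniqueness-across-time property can be illustrated with a minimal sketch. The `OidAllocator` class and its method names are hypothetical, not part of any DBMS; the only point is that an OID value is never handed out again after its object is deleted.

```python
import itertools


class OidAllocator:
    """Toy object store that hands out OIDs which are never reused."""

    def __init__(self):
        self._next = itertools.count(1)  # monotonically increasing counter
        self._live = {}                  # oid -> object currently stored

    def insert(self, obj):
        oid = next(self._next)
        self._live[oid] = obj
        return oid

    def delete(self, oid):
        # The object goes away, but the oid value is retired for good.
        del self._live[oid]

    def lookup(self, oid):
        return self._live.get(oid)


db = OidAllocator()
a = db.insert("theater A")
db.delete(a)
b = db.insert("theater B")  # gets a fresh oid, not the one freed by a
```

A stale reference holding `a` therefore can never silently resolve to a different object.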
Objects and OIDs

• An object’s OID can be used to refer to it from
elsewhere in the data.
• An OID has a type similar to the type of a
pointer in a programming language.
• In SQL:1999 every tuple can be given an OID
by defining the table in terms of a structured
type and declaring that a REF type is
associated with it.
– E.g., the Theaters table
• REF types have values that are unique
identifiers or OIDs.
• SQL:1999 requires that a given REF type must
be associated with a table.
Notions of Equality

• Two objects having the same type are
defined to be deep equal if and only if one of
the following holds:
– The objects are of atomic type and have
the same value.
– The objects are of reference type and the
deep equals operator is true for the two
referenced objects.
– The objects are of structured type and the
deep equals operator is true for all the
corresponding subparts of the two objects.
• Two objects having the same type are
defined to be shallow equal if both refer
to the same object (that is, both references
hold the same OID).
Equality Example

ROW(538, t89, 6-3-97, 8-7-97)
ROW(538, t33, 6-3-97, 8-7-97)

Shallow equals => false, because t89 and t33 are
different OIDs.

Deep equals => true if t89 and t33 refer to
objects of type theater_t that have the
same value.
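
The two notions can be sketched in a few lines. This is an illustration only: the `Ref` class, the `STORE` dictionary and the theater values in it are assumptions made up for the example, and rows are modelled as plain tuples.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Ref:
    oid: str  # a reference holds only the oid of the object it points to


# Hypothetical object store: oid -> theater_t value (name, address).
STORE = {
    "t89": ("Majestic", "115 King"),
    "t33": ("Majestic", "115 King"),  # same value, but a different object
}


def shallow_equal(x, y):
    """References must hold the same oid; other parts compare by value."""
    if isinstance(x, Ref) and isinstance(y, Ref):
        return x.oid == y.oid
    if isinstance(x, tuple) and isinstance(y, tuple):
        return len(x) == len(y) and all(shallow_equal(a, b) for a, b in zip(x, y))
    return x == y


def deep_equal(x, y):
    """References are followed and the referenced objects are compared."""
    if isinstance(x, Ref) and isinstance(y, Ref):
        return deep_equal(STORE[x.oid], STORE[y.oid])
    if isinstance(x, tuple) and isinstance(y, tuple):
        return len(x) == len(y) and all(deep_equal(a, b) for a, b in zip(x, y))
    return x == y


row1 = (538, Ref("t89"), "6-3-97", "8-7-97")
row2 = (538, Ref("t33"), "6-3-97", "8-7-97")
```

With this store, `shallow_equal(row1, row2)` is false while `deep_equal(row1, row2)` is true, matching the example above.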
Dereferencing Reference Type

• To access the referenced base-type
item, a built-in deref() method is
provided along with the REF type
constructor.
– Nowshowing.deref(theater).name
• SQL:1999 uses a C-style arrow
operator, combined with the dot operator, to
access the referenced item.
– Nowshowing.theater->name
URL and OID
• An OID uniquely identifies a single object over all
time.
• The Web resource pointed at by a URL can change
over time.
• OIDs are simple identifiers and carry no physical
information about the objects they identify.
• URLs include network addresses and often
file-system names.
• OIDs are automatically generated by the system.
• URLs are specified by users.
• Deletion of a referenced object can be checked by
including REFERENCES ARE CHECKED as part of the
SCOPE clause and choosing one of the permitted actions.
Database Design for ORDBMS

• Example
Several space probes each continuously record a
video.
A single video stream is associated with each
probe. Although this stream was collected
over a certain time period, we assume that it
is now a complete object associated with the
probe.
During the time period over which the video
was collected, the probe’s location was
periodically recorded.
The information associated with a probe has
three parts
- a probe ID
RDBMS Design
Probes(pid: integer, time: timestamp, lat: real,
long: real, camera: string, video: BLOB)

Tuples for the same probe have different time, lat, and
long values but the same pid, camera, and video values.

Functional dependencies (P = pid, T = time, L = lat,
N = long, C = camera, V = video):
PTLN → CV
P → CV

Because of P → CV, the relation needs to be decomposed:
Probes_Loc(pid: integer, time: timestamp, lat:
real, long: real)
Probes_Video(pid: integer, camera: string, video:
BLOB)
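
A small sketch of the decomposed schema, using Python's built-in sqlite3 module as a stand-in for the RDBMS. The column types and sample values are assumptions chosen for the example; the schema names come from the slide.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE Probes_Loc(pid INTEGER, time TEXT, lat REAL, long REAL,
                        PRIMARY KEY (pid, time, lat, long));
CREATE TABLE Probes_Video(pid INTEGER PRIMARY KEY, camera TEXT, video BLOB);
""")

# One video tuple per probe, many location tuples per probe.
con.execute("INSERT INTO Probes_Video VALUES (10, 'cam-A', ?)", (b"\x00\x01",))
con.executemany(
    "INSERT INTO Probes_Loc VALUES (10, ?, ?, ?)",
    [("13:00", 1.0, 2.0), ("13:05", 1.5, 2.5)],
)

# Recombining location and camera information needs a join on pid.
rows = con.execute("""
    SELECT L.pid, L.time, L.lat, L.long, V.camera
    FROM Probes_Loc L JOIN Probes_Video V ON L.pid = V.pid
    ORDER BY L.time
""").fetchall()

video_copies = con.execute("SELECT COUNT(*) FROM Probes_Video").fetchone()[0]
```

After decomposition the video is stored once per probe rather than once per location reading, at the price of a join whenever both parts are needed.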
Drawbacks of RDBMS Design

• Representing the video as a BLOB requires
application code, written in an external
language, to manipulate a video object in the
database.
– For probe 10, display the video recorded between
1:00 pm and 1:10 pm on Sept 22 2006.
• The entire video object associated with probe 10, recorded
over several hours, needs to be retrieved to display a
segment recorded over 10 minutes.

• The fact that each probe has an associated sequence of
location readings is hidden.
• Sequence information is spread across several
tuples.
• Some queries will require a join.
ORDBMS Design

• Store the video as an ADT object and write
methods to manipulate it.
• A structured type can be used to store the location
sequence.

Probes_AllInfo(pid: integer, locseq: location_seq,
camera: string, video: mpeg_stream)

ADT: mpeg_stream, with a method display()
that takes a start time and an end time and
displays the portion of the video recorded between
those times.
Queries

SELECT display(P.video, 1:00 PM Sept 22 2006,
               1:10 PM Sept 22 2006)
FROM Probes_AllInfo P
WHERE P.pid = 10
Structured Type

Structured type: location_seq, defined as a list
type containing a list of ROW type objects.

CREATE TYPE location_seq listof
    (ROW(time: timestamp, lat: real, long: real))

• Query to find the earliest recorded time for
each probe.

SELECT P.pid, MIN(P.locseq.time)
FROM Probes_AllInfo P
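
What the MIN over the nested list computes can be mimicked in plain Python. The sample rows and the ISO-style timestamp strings are assumptions for the example; each probe carries its whole location sequence in one field, so no join is needed.

```python
# Each probe row holds its location sequence as a list of (time, lat, long).
probes_allinfo = [
    {"pid": 10, "locseq": [("2006-09-22 13:05", 1.5, 2.5),
                           ("2006-09-22 13:00", 1.0, 2.0)]},
    {"pid": 11, "locseq": [("2006-09-21 09:30", 4.0, 7.0)]},
]

# ISO-formatted timestamps compare correctly as strings.
earliest = {
    p["pid"]: min(t for t, _lat, _long in p["locseq"])
    for p in probes_allinfo
}
```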
Difference between structured and reference type
– my_theater tuple(tno integer, name text, address
text, phone text)
– theater REF(theater_t)
• Deletion
– Objects with references can be affected by
the deletion of objects that they reference
• e.g., deletion of the Theaters table
– Reference-free structured objects are not
affected by the deletion of other objects.
Difference between structured and reference type
• Update
– Objects of reference types change value if the
referenced object is updated.
– Objects of reference-free structured types change
value only if updated directly.

• Sharing and copying
– An identified object can be referenced by multiple
reference-type items, so each update is reflected
in many places.
– With reference-free structured types, an update
must be applied to every copy of an object.
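
The update and sharing behaviour can be imitated with ordinary Python objects. This is an analogy rather than DBMS code: a shared dictionary plays the role of an identified object, and deep copies play the role of reference-free structured values. All names and values are made up for the example.

```python
import copy

theater = {"tno": 1, "name": "Majestic", "phone": "555-0100"}

# Reference type: both rows point at the same identified object.
by_ref_1 = {"film": 538, "theater": theater}
by_ref_2 = {"film": 539, "theater": theater}

# Reference-free structured type: each row embeds its own copy.
by_val_1 = {"film": 538, "theater": copy.deepcopy(theater)}
by_val_2 = {"film": 539, "theater": copy.deepcopy(theater)}

# A single update to the shared object.
theater["phone"] = "555-0199"
```

Both `by_ref` rows now see the new phone number, while the `by_val` rows keep the old one until each copy is updated separately.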
Difference between structured and reference type
• Storage Overhead
– Multiple copies of large values in structured type
objects require much more storage.
– This affects disk usage and buffer management (if
multiple copies are accessed at once).
• Clustering
– The subparts of a structured object are typically
stored together on disk.
– Objects with references may point to other objects
that are far away on the disk, thus requiring
significant movement of the disk arm.
OID vs Foreign Key

• An OID can point to an object that is
stored anywhere in the database, even
in a field.
• A foreign key reference is constrained to
point to an object in a particular
referenced relation.

• Referential integrity can be a problem
for OIDs.
– An object may be deleted while there are still
OID pointers to it.
EER

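[Figure: EER diagram of the Probes entity with attributes pid,
camera, video (with method Display(start, end)), and a location
sequence of type Listof(row(time, lat, long))]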
Using Nested Collections

Can_Teach1(cid: integer, teachers: setof(ssn:
string), sal: integer)
Course cid can be taught by the team of
teachers, at a combined cost of sal.

Can_Teach2(cid: integer, teacher_ssn: string,
sal: integer)

Course cid can be taught by any of the
teachers in the teachers field, at a cost sal.
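
The two schemas are related by flattening (unnesting) the teachers set. The sketch below uses made-up sample rows and a hypothetical `unnest` helper to show the mechanical step only.

```python
# Nested form: one row per course, with a set of teacher ssn values.
can_teach1 = [
    {"cid": 101, "teachers": {"111-11", "222-22"}, "sal": 50},
    {"cid": 102, "teachers": {"333-33"}, "sal": 30},
]


def unnest(rows):
    """Flatten the teachers set: one output row per (course, teacher) pair."""
    return sorted(
        (r["cid"], ssn, r["sal"]) for r in rows for ssn in r["teachers"]
    )


can_teach2 = unnest(can_teach1)
```

Whether the flat rows mean the same thing depends on how sal is read: a combined team cost cannot be attributed to a single teacher, whereas a per-teacher cost survives flattening unchanged.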
ORDBMS Implementation Challenges - 1

Storage and Access Methods


• Storing Large ADT and Structured Type
Objects
– ADT
• Can be large (larger than a single disk page).
• Stored in a different location on disk from the tuple, with
disk-based pointers maintained from the tuple to the object.
– Structured Type
• Often vary in size.
• Can grow arbitrarily and hence require flexible disk
layout mechanisms.
• Array-type items need to be stored in sequence, but
queries may request subarrays that are not stored
contiguously, thus requiring many I/O requests.
• To reduce I/O, arrays are often broken into chunks
that are then stored in some order on disk.
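
The effect of chunking can be estimated with a small page-counting sketch. The two helper functions and the sizes used (a 1024 x 1024 array, 1024 elements per page, 32 x 32 chunks) are assumptions for illustration, not figures from any particular system.

```python
def pages_row_major(n_cols, page_elems, r0, r1, c0, c1):
    """Distinct pages touched when the array is laid out row by row."""
    pages = set()
    for r in range(r0, r1):
        for c in range(c0, c1):
            pages.add((r * n_cols + c) // page_elems)
    return len(pages)


def pages_chunked(chunk, r0, r1, c0, c1):
    """Distinct pages touched when each chunk x chunk tile fills one page."""
    tiles = {(r // chunk, c // chunk)
             for r in range(r0, r1)
             for c in range(c0, c1)}
    return len(tiles)


# A 32 x 32 subarray taken from the top-left corner of the array.
row_major = pages_row_major(1024, 1024, 0, 32, 0, 32)
chunked = pages_chunked(32, 0, 32, 0, 32)
```

Under these assumptions the row-by-row layout touches one page per requested row, while the chunked layout serves the same subarray from a single page.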
ORDBMS Implementation Challenges - 1

• Indexing New Types
– Efficient access can be provided by
using indexes.
– An RDBMS index typically supports only
equality and range conditions.
– An ORDBMS requires efficient indexes for ADT
methods and for operators on structured
types.
– One way to make the set of index
structures extensible is to publish an
“access method interface” that lets users
implement an index structure outside the
DBMS.
ORDBMS Implementation Challenges - 1

• Indexing New Types (contd.)
– An alternative is to provide a generic
template index structure.
– The “Generalized Search Tree (GiST)” is
such a structure.
– It is a template index structure based on B+
trees, which allows most of the tree index
structures to be implemented with only a
few lines of user-defined ADT code.
ORDBMS Implementation Challenges - 2

Query Processing
• User-defined aggregation functions
– Users can define new aggregate functions.
– To register a new aggregate function, a user
must implement three methods:
• Initialize
• Iterate
• Terminate
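
The three-method protocol can be sketched as follows. The `SecondLargest` aggregate and the `run_aggregate` driver are hypothetical; a real system would call the registered methods from inside the query executor.

```python
class SecondLargest:
    """Hypothetical user-defined aggregate: second-largest value in a column."""

    def initialize(self):
        self.top = []  # at most the two largest values seen so far

    def iterate(self, value):
        self.top = sorted(self.top + [value], reverse=True)[:2]

    def terminate(self):
        return self.top[1] if len(self.top) == 2 else None


def run_aggregate(agg, column):
    """Drive an aggregate the way an executor might: init, one call per row, finish."""
    agg.initialize()
    for value in column:
        agg.iterate(value)
    return agg.terminate()


result = run_aggregate(SecondLargest(), [7, 3, 9, 4])
```

Because the aggregate keeps only a small running state, it can be evaluated in one pass over the column.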
ORDBMS Implementation Challenges - 2

• Method Security
– ADTs give users the power to add code to the
DBMS.
– The DBMS must have mechanisms to prevent
buggy or malicious code from causing
problems.
– One option: user methods can be interpreted
rather than compiled.
– Another option: allow compiled methods but run
them in a different address space than
the DBMS, communicating through IPC.
ORDBMS Implementation Challenges - 2

• Method caching
– ADT methods can be expensive to execute
and can account for the bulk of the time spent on
query execution.
– During query processing it may make sense
to cache the results, in case they can be
used again.
– Within the scope of a single query, one can
avoid calling a method twice on duplicate
values in a column by either sorting the table
or using hashing techniques.
– An alternative is to maintain a cache of
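
The hashing approach within a single query can be sketched as below. The body of `is_sunrise` is a stand-in for an expensive ADT method, and `select_with_cache` is a hypothetical helper; only the number of method calls matters here.

```python
calls = 0


def is_sunrise(image):
    """Stand-in for an expensive ADT method; counts how often it runs."""
    global calls
    calls += 1
    return image.startswith("sun")


def select_with_cache(rows, method):
    """Apply method as a selection, calling it once per distinct value."""
    cache = {}  # hash table: argument value -> cached result
    out = []
    for row in rows:
        if row not in cache:
            cache[row] = method(row)
        if cache[row]:
            out.append(row)
    return out


images = ["sun1", "moon", "sun1", "moon", "sun2", "sun1"]
matches = select_with_cache(images, is_sunrise)
```

Six input values trigger only three method calls, one per distinct value.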
ORDBMS Implementation Challenges - 2

• Pointer Swizzling
– In some applications, objects are retrieved into
memory and accessed frequently through their
oids.
– Some systems maintain a table of the oids of objects
that are currently in memory.
– When an object O is brought into memory, they
check each oid contained in O and replace the oids of
in-memory objects by in-memory pointers to those
objects. This technique is called “pointer
swizzling” and makes references faster.
– Caution: if an object is paged out, in-memory
references to it must be invalidated and replaced
with its oid.
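
A minimal sketch of swizzling and unswizzling follows. The `DISK` dictionary, the `Obj` class and the `load` and `evict` functions are all hypothetical; integers stand for oids and Python object references stand for in-memory pointers.

```python
class Obj:
    def __init__(self, oid, refs):
        self.oid = oid
        self.refs = list(refs)  # each entry: an oid (int) or an in-memory Obj


DISK = {1: [2, 3], 2: [1], 3: []}  # hypothetical store: oid -> oids it references
resident = {}                      # table of objects currently in memory, by oid


def load(oid):
    """Bring an object into memory and swizzle the oids it contains."""
    obj = Obj(oid, DISK[oid])
    resident[oid] = obj
    # Replace oids of in-memory objects by direct pointers to them.
    obj.refs = [resident.get(r, r) for r in obj.refs]
    return obj


def evict(oid):
    """Page an object out and turn pointers to it back into its oid."""
    victim = resident.pop(oid)
    for other in resident.values():
        other.refs = [oid if r is victim else r for r in other.refs]


o2 = load(2)                   # object 1 is not resident yet, so 1 stays an oid
o1 = load(1)                   # object 2 is resident, so its oid is swizzled
swizzled = o1.refs[0] is o2
evict(2)
after_evict = list(o1.refs)    # back to plain oids
```

This follows the description above literally: only the newly loaded object is scanned, so references held by objects loaded earlier stay as oids until they are resolved some other way.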
ORDBMS Implementation Challenges - 3
Query optimization
• Registering Indexes with the optimizer
– The optimizer must be informed about the new
index structures.
– The optimizer must know
• What WHERE-clause conditions are matched by that index
• What is the cost of fetching a tuple for that index
– Given this information, the optimizer can use any
index structure in constructing a query plan.
• Reduction factor and cost estimation for ADT
methods
– For user-defined conditions such as is_sunrise(),
the optimizer needs to estimate the reduction factor.
– Users who register the method can also specify the
method’s cost as a number, typically in units of the
ORDBMS Implementation Challenges - 3

• Expensive selection optimization
– An RDBMS optimizer treats selection as a
zero-time operation.
– Large objects take time to access, and
processing them in memory is complicated.
– ORDBMS optimizers must consider carefully
how to order selection conditions.
Frames.frameno < 100 AND
is_herbert(Frames.image)
– The best ordering among selections is a
function of their costs and reduction factors.
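
How cost and reduction factor interact can be seen by computing the expected per-tuple cost of each ordering. The numbers below are invented for illustration, and the sketch assumes the two conditions are independent; it simply tries every ordering rather than using any particular optimizer algorithm.

```python
from itertools import permutations


def expected_cost(order):
    """Expected per-tuple cost: a predicate runs only on tuples that passed earlier ones."""
    cost, surviving = 0.0, 1.0
    for pred_cost, reduction in order:
        cost += surviving * pred_cost
        surviving *= reduction
    return cost


# Hypothetical (cost per tuple, reduction factor) for each condition.
preds = {
    "frameno < 100": (0.001, 0.1),
    "is_herbert(image)": (100.0, 0.01),
}

best = min(
    permutations(preds),
    key=lambda names: expected_cost([preds[n] for n in names]),
)
cheap_first = expected_cost([preds["frameno < 100"], preds["is_herbert(image)"]])
```

With these numbers, running the cheap frameno test first means the expensive method is evaluated on only a tenth of the tuples.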
OODBMS

• OQL:
– OQL is similar to SQL, with a
select-from-where syntax.
– Supports structured types (array, set, bag,
list).
– Allows aggregate operations on structured
types.
• COUNT
– Supports reference types, path expressions,
ADTs, inheritance, nested queries, etc.
– Has a standard data definition language,
ODL.
ODMG Data Model

• The ODMG data model is the basis for an OODBMS.
• A database is a collection of objects, which are
similar to entities in the ER model.
• Every object has a unique id, and a collection of
objects with similar properties is called a class.
• The properties of a class are specified using
ODL and are of three kinds:
– Attributes,
– Relationships, and
– Methods.
ODMG Data Model
• ATTRIBUTE
– Has an atomic type or a structured type
• set, bag, list, array, struct type
• RELATIONSHIP
– Is either a reference to an object or a collection of
references.
– A relationship captures how an object is related to
one or more objects of the same class or a different
class.
– A relationship in the ODMG model is a binary
relationship in the sense of the ER model.
– A relationship has a corresponding inverse
relationship.
• Student belongs_to a Division
• Division has many Students.
• METHODS
ODL Definitions
• The keyword interface is used to define a class.
• For each interface we can declare an extent,
which is the name of the current set of objects of
that class.
• In the definition below, Movie is the class and
Movies is its extent (the collection of objects of
that class).
interface Movie
(extent Movies key movieName)
{ attribute date start;
  attribute date end;
  attribute string movieName;
  relationship Set<Theater> shownAt inverse Theater::nowShowing;
}
ODL Definitions

interface Theater
(extent Theaters key theaterName)
{ attribute string theaterName;
attribute string address;
attribute integer ticketPrice;
relationship Set<Movie> nowShowing inverse Movie::shownAt;
float numshowing() raises(errorCountingMovies);
}
ODL Definitions

• ODL allows inheritance hierarchies to be
specified.

interface SpecialShow extends Movie
(extent SpecialShows)
{
  attribute integer maximumAttendees;
  attribute string benefitCharity;
}
OODBMS/ORDBMS Differences

• OODBMSs try to add DBMS functionality to a
programming language.
• ORDBMSs try to add richer data types to a
relational DBMS.
• An OODBMS is aimed at applications where an
object-centric viewpoint is appropriate, that is,
where a user session consists of retrieving a few
objects and working on them.
• An ORDBMS is optimized for applications in
which large data collections are the focus.
Objects may have rich structure and be fairly
large, and applications are expected to retrieve
data from disk.
OODBMS/ORDBMS difference

• OODBMSs aim to achieve integration
with a programming language such as
C++, Java, or Smalltalk.
• Such integration is not an important goal of
an ORDBMS.
• The query facilities of OQL are not supported
efficiently in OODBMSs.
• Query facilities are the centerpiece of an
ORDBMS.
