
WATERFALL MODEL

One such approach used in software development is the Waterfall Model. The waterfall approach was the first process model to be introduced and widely followed in software engineering to ensure the success of a project. In the waterfall approach, the whole process of software development is divided into separate phases. The phases in the model are:

a. Requirement specifications
b. Software design
c. Implementation
d. Testing and maintenance

These phases are cascaded: the next phase starts only once the defined set of goals for the previous phase has been achieved and that phase has been signed off, hence the name. All the methods and processes undertaken in the model are clearly visible.

Stages of the Waterfall Model Explained

Requirement Analysis and Definition
All possible requirements of the system to be developed are captured in this phase. Requirements are the set of functions and constraints that the end user (who will be using the system) expects from the system. The requirements are gathered from the end user at the start of software development. They are analyzed for their validity, and the possibility of incorporating them in the system to be developed is also studied. Finally, a requirement specification document is created, which serves as a guideline for the next phase of the model.

System and Software Design
Before the actual coding phase starts, it is very important to understand the requirements of the end user and to have an idea of what the end product should look like. The requirement specifications from the first phase are studied in this phase and a system design is prepared. System design helps in specifying hardware and system requirements and in defining the overall system architecture. The system design specifications serve as input for the next phase of the model.

Implementation and Unit Testing
On receiving the system design documents, the work is divided into modules/units and actual coding starts. The system is first developed as small programs called units, which are integrated in the next phase. Each unit is developed and tested for its functionality; this is referred to as unit testing. Unit testing mainly verifies that the modules/units meet their specifications.

Integration and System Testing
As specified above, the system is first divided into units which are developed and tested for their functions. These units are integrated into a complete system during the integration phase and tested to check whether all modules/units coordinate with each other and the system as a whole behaves as per the specifications. After the software is tested successfully, it is delivered to the customer.

Operations & Maintenance
This phase of the model is virtually never-ending. Generally, problems with the developed system (which were not found during the development life cycle) come up after its practical use starts, so issues related to the system are resolved after deployment. Not all problems appear at once; they arise from time to time and need to be solved, hence this process is referred to as maintenance.

Advantages
There is clear compartmentalization of work and control in the model. Because of this compartmentalization, it is easier to set a schedule for the tasks to be completed within a specified time period. Another advantage is that a phase starts only after the work for the previous phase is over, so phases do not overlap and the product does not have to go through iterative steps.

This model is the easiest to implement in the eyes of most managers, due to its linear nature. Since the processes are carried out in a linear manner, the cost of resources is reduced to a large extent, which in turn helps reduce the cost of the project considerably. Lastly, documentation and testing happen at the end of each phase, which helps maintain the quality of the project.

Disadvantages
It is very important to gather all possible requirements during the requirement gathering and analysis phase in order to design the system properly, yet not all requirements are received at once; the customer keeps adding requirements even after the "requirement gathering and analysis" phase ends, which negatively affects the system development process and its success. The problems with one phase are never solved completely during that phase; many problems related to a particular phase arise after it is signed off, resulting in a badly structured system because not all problems related to a phase are solved during that same phase. The project is not partitioned into phases in a flexible way. As the customer's requirements keep getting added to the list, not all of them are fulfilled, which can result in an almost unusable system. These requirements are then met in a newer version of the system, which increases the cost of development. Although the model has these disadvantages, it continues to be one of the most widely used software development approaches in the industry. This can be attributed to the fact that its advantages outweigh its disadvantages for certain kinds of projects. And lastly, if the team does not have a lot of experience, this model serves the purpose well.

SPIRAL MODEL
The Spiral Life Cycle Model is a type of iterative software development model generally used for high-risk projects. It was first proposed by Boehm. This system development method combines the features of both the waterfall model and the prototype model. In the spiral model, all the activities are arranged in the form of a spiral. Each loop in the spiral represents a development phase (and there can be any number of loops according to the project). Each loop has four sections or quadrants:

1. Determine the objectives, alternatives and constraints. Here we try to understand the product objectives, alternatives in design and the constraints imposed by cost, technology, schedule, etc.
2. Risk analysis and evaluation of alternatives. Here we try to find which other approaches can be implemented to satisfy the identified constraints. Operational and technical issues are addressed here. Risk mitigation is the focus of this phase, and evaluation of all these factors determines future action.
3. Execution of that phase of development. In this phase we develop the planned product, and testing is also done. For the development itself, a waterfall or incremental approach can be used.
4. Planning the next phase. Here we review the progress and judge it considering all parameters. Issues that need to be resolved are identified in this phase and the necessary steps are taken.

Subsequent loops of the spiral involve similar phases. Analysis and engineering effort is applied at each loop. Large, expensive or complicated projects use this type of life cycle. If at any point one feels the risk involved in the project is much greater than anticipated, the project can be aborted. Reviews at different phases can be done by an in-house person or by an external client.

Why is the spiral model called a meta-model?


The spiral model is also called a meta-model because, in a way, it comprises other SDLC models. Both the waterfall and prototype models are used in it: software development proceeds systematically over the loops (adhering to the waterfall approach), and at the same time a prototype is made and shown to the user after completion of each phase (as in the prototype model). This way we are able to reduce risks while still following a systematic approach. Now let's discuss the advantages and disadvantages of the spiral model in detail.

Spiral Model Diagram

Advantages of Spiral Model


1) The Spiral Life Cycle Model is one of the most flexible SDLC models in place. Development phases can be determined by the project manager according to the complexity of the project.
2) Project monitoring is very easy and effective. Each phase, as well as each loop, requires a review from the people concerned. This makes the model more transparent.
3) Risk management is one of the built-in features of the model, which makes it extra attractive compared to other models.
4) Changes can be introduced later in the life cycle as well, and coping with these changes isn't a very big headache for the project manager.
5) Project estimates in terms of schedule, cost, etc. become more and more realistic as the project moves forward and loops in the spiral get completed.
6) It is suitable for high-risk projects, where business needs may be unstable.
7) A highly customized product can be developed using this model.

Disadvantages of Spiral Model


1) The cost involved in this model is usually high.
2) It is a complicated approach, especially for projects with a clear SRS.
3) The skills required to evaluate and review the project from time to time demand expertise.
4) Rules and protocols should be followed properly to implement this model effectively, and doing so throughout the span of the project is tough.
5) Because of the various customizations allowed for the client, reusing the same prototype in other projects in the future is difficult.
6) It is not suitable for low-risk projects.
7) Meeting budgetary and scheduling requirements is tough if this development process is followed.
8) The amount of documentation required in intermediate stages makes management of the project a very complex affair.

PROTOTYPING MODEL
After the waterfall model, let's discuss what the prototyping model in software development is. Here, a prototype is made first, and the final product is developed based on it. A prototype is a model or a program which is not based on strict planning, but is an early approximation of the final product or software system. A prototype acts as a sample to test the process; from this sample we learn and try to build a better final product. Note that this prototype may or may not closely resemble the final system we are trying to develop.

Need of Prototyping Model


This type of system development method is employed when it is very difficult to obtain exact requirements from the customer (unlike the waterfall model, where requirements are clear). While the model is being built, the user keeps giving feedback from time to time, and the prototype is revised based on it. The completely built sample model is shown to the user and, based on his feedback, the SRS (System Requirements Specification) document is prepared. After this is complete, a more accurate SRS is ready, and development work can start using the waterfall model. Now let's discuss the advantages and disadvantages of the prototype model in this software development method.

Prototyping Process Model

Advantages of Prototyping Model


1) When the prototype is shown to the user, he gets a proper clarity and 'feel' of the functionality of the software and can suggest changes and modifications.
2) This approach to developing software is useful for non-IT-literate people, who usually are not good at specifying their requirements and cannot describe properly what they expect from the software.
3) When the client is not confident about the developer's capabilities, he may ask for a small prototype to be built first. Based on this model, he judges the capabilities of the developer.
4) Sometimes it helps to demonstrate the concept to prospective investors to get funding for the project.
5) It reduces the risk of failure, as potential risks can be identified early and mitigation steps can be taken.
6) Iteration between the development team and the client provides a very good and conducive environment during the project.
7) The time required to complete the project after the final SRS is ready is reduced, since the developer has a better idea of how to approach the project.

Disadvantages of Prototyping Model:


1) Prototyping is usually done at the cost of the developer, so it should be done using minimal resources. It can be done using Rapid Application Development (RAD) tools. Note that sometimes the start-up cost of building a development team focused on the prototype is high.
2) Once we get proper requirements from the client after showing the prototype, it may be of no further use. That is why the prototype is sometimes referred to as a "throw-away" prototype.
3) It is a slow process.
4) Too much involvement of the client is not always preferred by the developer.
5) Too many changes can disturb the rhythm of the development team.

NETWORK TOPOLOGY
In computer networking, topology refers to the layout of connected devices. This article introduces the standard topologies of networking.

Topology in Network Design


Think of a topology as a network's virtual shape or structure. This shape does not necessarily correspond to the actual physical layout of the devices on the network. For example, the computers on a home LAN may be arranged in a circle in a family room, but it would be highly unlikely to find a ring topology there. Network topologies are categorized into the following basic types:

- bus
- ring
- star
- tree
- mesh

More complex networks can be built as hybrids of two or more of the above basic topologies.
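As an illustration (not part of the original text), a topology can be modelled as a simple graph that maps each device to the set of devices it is directly linked to. The device names below are invented; the sketch only shows how ring and star shapes differ structurally.

# Illustrative sketch: topologies as graphs (adjacency sets).

def ring(devices):
    """Each device connects to exactly two neighbours."""
    n = len(devices)
    return {d: {devices[(i - 1) % n], devices[(i + 1) % n]}
            for i, d in enumerate(devices)}

def star(hub, devices):
    """Every device connects only to the central hub."""
    graph = {d: {hub} for d in devices}
    graph[hub] = set(devices)
    return graph

print(ring(["pc1", "pc2", "pc3", "pc4"]))
print(star("switch", ["pc1", "pc2", "pc3"]))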

Bus Topology
Bus networks (not to be confused with the system bus of a computer) use a common backbone to connect all devices. A single cable, the backbone, functions as a shared communication medium that devices attach or tap into with an interface connector. A device wanting to communicate with another device on the network sends a broadcast message onto the wire that all other devices see, but only the intended recipient actually accepts and processes the message. Ethernet bus topologies are relatively easy to install and don't require much cabling compared to the alternatives. 10Base-2 ("ThinNet") and 10Base-5 ("ThickNet") were both popular Ethernet cabling options many years ago for bus topologies. However, bus networks work best with a limited number of devices. If more than a few dozen computers are added to a network bus, performance problems will likely result. In addition, if the backbone cable fails, the entire network effectively becomes unusable.

Advantages of a Linear Bus Topology


- Easy to connect a computer or peripheral to a linear bus.
- Requires less cable length than a star topology.

Disadvantages of a Linear Bus Topology


- Entire network shuts down if there is a break in the main cable.
- Terminators are required at both ends of the backbone cable.
- Difficult to identify the problem if the entire network shuts down.
- Not meant to be used as a stand-alone solution in a large building.

Ring Topology
In a ring network, every device has exactly two neighbors for communication purposes. All messages travel through a ring in the same direction (either "clockwise" or "counterclockwise"). A failure in any cable or device breaks the loop and can take down the entire network. To implement a ring network, one typically uses FDDI, SONET, or Token Ring technology. Ring topologies are found in some office buildings or school campuses.

Star Topology
Many home networks use the star topology. A star network features a central connection point called a "hub node" that may be a network hub, switch or router. Devices typically connect to the hub with Unshielded Twisted Pair (UTP) Ethernet. Compared to the bus topology, a star network generally requires more cable, but a failure in any star network cable will only take down one computer's network access and not the entire LAN. (If the hub fails, however, the entire network also fails.)

Advantages of a Star Topology


- Easy to install and wire.
- No disruptions to the network when connecting or removing devices.
- Easy to detect faults and to remove parts.

Disadvantages of a Star Topology


- Requires more cable length than a linear topology.
- If the hub, switch, or concentrator fails, the attached nodes are disabled.
- More expensive than linear bus topologies because of the cost of the hubs, etc.

Tree Topology
Tree topologies integrate multiple star topologies together onto a bus. In its simplest form, only hub devices connect directly to the tree bus, and each hub functions as the root of a tree of devices. This bus/star hybrid approach supports future expandability of the network much better than a bus (limited in the number of devices due to the broadcast traffic it generates) or a star (limited by the number of hub connection points) alone.

Advantages of a Tree Topology


- Point-to-point wiring for individual segments.
- Supported by several hardware and software vendors.

Disadvantages of a Tree Topology


- Overall length of each segment is limited by the type of cabling used.
- If the backbone line breaks, the entire segment goes down.
- More difficult to configure and wire than other topologies.

Mesh Topology
Mesh topologies involve the concept of routes. Unlike each of the previous topologies, messages sent on a mesh network can take any of several possible paths from source to destination. (Recall that even in a ring, although two cable paths exist, messages can only travel in one direction.) Some WANs, most notably the Internet, employ mesh routing.

A mesh network in which every device connects to every other is called a full mesh. As shown in the illustration below, partial mesh networks also exist in which some devices connect only indirectly to others.
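As a rough illustration (not from the original text) of why full meshes are rarely built beyond small cores, the number of links needed for a full mesh of n devices is n(n-1)/2. The short sketch below simply evaluates that formula for a few sizes.

# Illustrative sketch: counting the point-to-point links in a full mesh,
# where every device links directly to every other device.

def full_mesh_links(n: int) -> int:
    """Number of links in a full mesh of n devices: n * (n - 1) / 2."""
    return n * (n - 1) // 2

if __name__ == "__main__":
    for n in (3, 5, 10, 50):
        print(f"{n} devices -> {full_mesh_links(n)} links")
    # 3 -> 3, 5 -> 10, 10 -> 45, 50 -> 1225: full meshes quickly become
    # expensive, which is why partial meshes are common in practice.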

Network Devices
Repeaters, Bridges, Routers, and Gateways
Network Repeater
A repeater connects two segments of your network cable. It retimes and regenerates the signals to proper amplitudes and sends them to the other segments. When talking about Ethernet topology, you are probably talking about using a hub as a repeater. Repeaters require a small amount of time to regenerate the signal. This can cause a propagation delay which can affect network communication when there are several repeaters in a row. Many network architectures limit the number of repeaters that can be used in a row. Repeaters work only at the physical layer of the OSI network model.

Bridge
A bridge reads the outermost section of data on the data packet, to tell where the message is going. It reduces the traffic on other network segments, since it does not send all packets. Bridges can be programmed to reject packets from particular networks. Bridging occurs at the data link layer of the OSI model, which means the bridge cannot read IP addresses, but only the outermost hardware address of the packet. In our case the bridge can read the ethernet data which gives the hardware address of the destination address, not the IP address. Bridges forward all broadcast messages. Only a special bridge called a translation bridge will allow two networks of different architectures to be connected. Bridges do not normally allow connection of networks with different architectures. The hardware address is also called the MAC (media access control) address. To determine the network segment a MAC address belongs to, bridges use one of:
- Transparent bridging: the bridge builds a table of addresses (the bridging table) as it receives packets. If the destination address is not in the bridging table, the packet is forwarded to all segments other than the one it came from. This type of bridge is used on Ethernet networks.
- Source route bridging: the source computer provides path information inside the packet. This is used on Token Ring networks.
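A minimal sketch of the transparent-bridging behaviour described above may help: the bridge learns source MAC addresses, filters frames whose destination is on the same segment, forwards frames to a known segment, and floods unknown destinations. Class and segment names are invented for illustration.

# Illustrative sketch of a learning (transparent) bridge.

class LearningBridge:
    def __init__(self, segments):
        self.segments = segments          # e.g. ["A", "B"]
        self.bridging_table = {}          # MAC address -> segment

    def handle_frame(self, src_mac, dst_mac, arrived_on):
        # Learn: remember which segment the source address was seen on.
        self.bridging_table[src_mac] = arrived_on

        # Forward: send to the known segment, filter if local, else flood.
        known = self.bridging_table.get(dst_mac)
        if known == arrived_on:
            return []                                         # filtered
        if known is not None:
            return [known]                                    # forwarded
        return [s for s in self.segments if s != arrived_on]  # flooded

bridge = LearningBridge(["A", "B"])
print(bridge.handle_frame("aa:aa", "bb:bb", "A"))  # ['B']  unknown destination: flood
print(bridge.handle_frame("bb:bb", "aa:aa", "B"))  # ['A']  aa:aa was learned on segment A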

Network Router
A router is used to route data packets between two networks. It reads the information in each packet to tell where it is going. If the packet is destined for an immediate network the router has access to, it will strip the outer packet, readdress the packet to the proper Ethernet address, and transmit it on that network. If it is destined for another network and must be sent to another router, it will repackage the outer packet to be received by the next router and send it on. The section on routing explains the theory behind this and how routing tables are used to help determine packet destinations. Routing occurs at the network layer of the OSI model. Routers can connect networks with different architectures such as Token Ring and Ethernet. Although they can transform information at the data link level, routers cannot transform information from one data format such as TCP/IP to another such as IPX/SPX. Routers do not forward broadcast packets or corrupted packets. If the routing table does not indicate the proper address for a packet, the packet is discarded.
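The following sketch, using Python's standard ipaddress module, illustrates the routing-table idea described above with longest-prefix matching. The networks and next hops are invented, and a packet with no matching route is simply discarded, as the text notes.

# Illustrative sketch of a routing-table lookup using longest-prefix match.

import ipaddress

routing_table = {
    ipaddress.ip_network("10.0.0.0/8"):     "router-2",
    ipaddress.ip_network("10.1.0.0/16"):    "eth0 (directly attached)",
    ipaddress.ip_network("192.168.1.0/24"): "eth1 (directly attached)",
}

def route(destination: str):
    dst = ipaddress.ip_address(destination)
    matches = [net for net in routing_table if dst in net]
    if not matches:
        return None                                          # no route: discard
    best = max(matches, key=lambda net: net.prefixlen)       # longest prefix wins
    return routing_table[best]

print(route("10.1.2.3"))      # eth0 (directly attached): /16 beats /8
print(route("10.200.0.1"))    # router-2
print(route("8.8.8.8"))       # None -> packet is discarded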

Brouter
There is a device called a brouter which functions like a bridge for network transport protocols that are not routable, and as a router for routable protocols. It functions at the network and data link layers of the OSI network model.

Gateway
A gateway can translate information between different network data formats or network architectures. It can translate TCP/IP to AppleTalk so computers supporting TCP/IP can communicate with Apple computers. Most gateways operate at the application layer, but they can operate at the network or session layer of the OSI model. A gateway starts at the lower levels and strips information until it gets to the required level, then repackages the information and works its way back down toward the hardware layer of the OSI model. To confuse matters, when talking about a router that is used to interface to another network, the word gateway is often used. This does not mean the routing machine is a gateway as defined here, although it could be.

The Impact of IT on Organizations


Information technology (IT) is dramatically changing the business landscape. Although organization cultures and business strategies shape the use of IT in organizations, more often the influence is stronger the other way round. IT significantly affects strategic options and creates opportunities and issues that managers need to address in many aspects of their business. This page outlines some of the key impacts of technology and the implications for management on:

- Business strategy: collapsing time and distance, enabling electronic commerce
- Organization culture: encouraging the free flow of information
- Organization structures: making networking and virtual corporations a reality
- Management processes: providing support for complex decision-making processes
- Work: dramatically changing the nature of professional, and now managerial, work
- The workplace: allowing work from home and on the move, as in telework

There is also the outline of an executive presentation that has been used to increase awareness of these issues.

The Impacts
Business Strategy
IT creates new opportunities for innovation in products and services. Services which used to be delivered in person can now be delivered over networks. Among the key levers are:

- Resequencing: including parallel processing of databases
- Simultaneity: making information instantly available in several systems (e.g. via OLE)
- Time extension: offering 24-hour-a-day, 365-days-a-year service
- Portability: taking services and products closer to the user
- Reusability: using information captured for one purpose (e.g. transactions) for others (e.g. customer targeting)

Organization Culture
Newer types of IT such as electronic mail and groupware are creating significant changes in the way that information flows around organizations, and between them and their customers and suppliers. This can hasten the development of more open and innovative cultures. However, as experts like Davenport warn, and surveys from companies like Reuters confirm, the notion that "information is power" still reigns large in many organizations. Also, our experience shows that many new systems fail to become accepted by their users because the systems developers have not been culturally sensitive to the department or organization in which the new systems are to be used.

Organization Structures
For many years it has been argued that IT will enable larger spans of control and the flattening of organizations. This has at last happened, but due as much to initiatives like BPR (business process reengineering) and the drive to cut costs. Research on whether IT encourages centralization or decentralization has produced ambivalent results. Many companies have centralized back-room operations (for efficiency) while at the same time decentralizing other activities. It now seems clear that IT enables a greater variety of structures. In particular, it enables more flexible and fluid structures: networked structures, dispersed teams, and teams that come and go as needs change (as in the virtual corporation).

Management Processes
IT is rapidly entering the era where it supports unstructured management processes as well as highly routinized business processes. It provides more effective ways of accessing information from multiple sources, including use of external information on databases and the Internet. Group decision support systems that operate in a meeting-room environment can help enhance decision making, but they need an expert facilitator to help the group master the technique of structured discussion.

Work
IT is dramatically changing the nature of professional work. There are few offices where professionals do not make use of personal computers, and in many jobs involving extensive information- and knowledge-based work, the use of the computer is often a core activity. Becoming effective requires not only the traditional skills of organizing, thinking, writing, etc., but also knowing how best to use the power of IT for researching sources, accessing information, connecting to experts, communicating ideas and results, and packaging the knowledge (asset) for reuse. One aspect of this is the need for hybrid managers: people who are competent at both their discipline and IT.

The Workplace
The way in which IT diminishes the effect of distance means that it creates a variety of options for reorganizing the workplace. At a basic level, it can provide more flexibility in the office, allowing desk sharing and a degree of location independence within a building (this will develop as CTI (Computer Telephony Integration) and wireless PCs become more firmly established). At another level, it permits the dispersion of work teams, thus saving costs of relocation and travel. It has also created the mobile professional and allows people to work effectively from home.

IT Enabled Services (Business Process Outsourcing)

IT enabled services (ITES), also called web enabled services, remote services or tele-working, cover the entire gamut of operations which exploit information technology for improving the efficiency of an organization. These services provide a wide range of career options that include opportunities in call centres, medical transcription, medical billing and coding, back office operations, revenue claims processing, legal databases, content development, payrolls, logistics management, GIS (Geographical Information System), HR services, web services, etc.

Introduction to IT Enabled Services (BPO)
Business Process Outsourcing (BPO) is defined as the delegation of one or more business processes to an external service provider, who in turn owns, manages and administers the selected processes based on defined and measurable performance metrics. Since the late 90s, BPO momentum has continued to grow worldwide as the factors that force companies to focus on core competencies intensify. These factors include the shortage of skilled labour in developed countries, mergers and acquisitions, and the competition imposed by economic globalization. Call centres have immense business potential: the current global business of ITES is approximately US$ 10 billion and is estimated to grow to US$ 140 billion by 2008 and to US$ 2000 billion by 2010. As per NASSCOM projections, IT enabled services can generate revenues of US$ 17 billion and provide employment for 1.1 million people in the next eight years.

BPO Segments
BPO services can be categorized as Customer Care, Finance, Human Resources, Payment Services, Administration, and Content Development. The services offered in these segments include:

- Customer care: database marketing, customer analytics, telesales/telemarketing, inbound call centre, web sales and marketing, sales and marketing administration.
- Finance: billing services, accounting transactions, tax consulting and compliance, risk management, financial reporting, financial analysis.
- Human resources: benefits administration, education and training, recruiting and staffing, payroll services, hiring administration, records management.
- Payment services: credit/debit card services, cheque processing, transaction processing.
- Administration: tax processing, claims processing, asset management, document management, transcription and translation.
- Content development: engineering, design, animation, network consultancy and management, biotech research.

Services like GIS, digitisation and web applications are also getting included in the list.

Call centre
A call centre or call center is a centralised office used for the purpose of receiving or transmitting a large volume of requests by telephone. An inbound call centre is operated by a company to administer incoming product support or information inquiries from consumers. Outbound call centres are operated for telemarketing, solicitation of charitable or political donations, and debt collection. In addition to a call centre, collective handling of letters, fax, live chat, and e-mail at one location is known as a contact centre. A call centre is operated through an extensive open workspace for call centre agents, with work stations that include a computer for each agent, a telephone set/headset connected to a telecom switch, and one or more supervisor stations. It can be independently operated or networked with additional centres, often linked to a corporate computer network, including mainframes, microcomputers and LANs. Increasingly, the voice and data pathways into the centre are linked through a set of new technologies called computer telephony integration (CTI). A contact centre, also known as a customer interaction centre, is a central point of an organization from which all customer contacts are managed. Through contact centres, valuable information about the company is routed to the appropriate people, contacts are tracked and data is gathered. It is generally a part of the company's customer relationship management (CRM). Today, customers contact companies by telephone, email, online chat, fax, and instant message.

Technology
Call centre technology is subject to improvements and innovations. Some of these technologies include speech recognition software to allow computers to handle the first level of customer support, text mining and natural language processing to allow better customer handling, agent training by automatic mining of best practices from past interactions, support automation and many other technologies to improve agent productivity and customer satisfaction.[1] Automatic lead selection or lead steering is also intended to improve efficiencies,[2] both for inbound and outbound campaigns: inbound calls are intended to land quickly with the appropriate agent, minimizing wait times and long lists of irrelevant options for people calling in, while for outbound calls, lead selection allows management to designate what type of leads go to which agent based on factors including skill, socioeconomic factors, past performance, and percentage likelihood of closing a sale per lead. The concept of the universal queue standardizes the processing of communications across multiple technologies such as fax, phone, and email, whilst the concept of a virtual queue provides callers with an alternative to waiting on hold when no agents are available to handle inbound call demand.

Dynamics
Calls may be inbound or outbound. Inbound calls are made by consumers, for example to obtain information, report a malfunction, or ask for help. In contrast, outbound calls are made by agents to consumers, usually for sales purposes (telemarketing). One can combine inbound and outbound campaigns.

Varieties
Some variations of call centre models are listed below:

- Contact centre: supports interaction with customers over a variety of media, including but not necessarily limited to telephony, e-mail and internet chat.
- Inbound call centre: exclusively or predominantly handles inbound calls (calls initiated by the customer).
- Outbound call centre: one in which call centre agents make outbound calls to customers or sales leads.
- Blended call centre: combining automatic call distribution for incoming calls with predictive dialling for outbound calls, it makes more efficient use of agent time as each type of agent (inbound or outbound) can handle the overflow of the other.
- Telephone answering service: a more personalised version of the call centre, where agents get to know more about their customers and their callers, and therefore look after calls just as if based in their customers' office.

Geographic information system


A geographic information system (GIS) is a system designed to capture, store, manipulate, analyze, manage, and present all types of geographical data. The acronym GIS is sometimes used for geographical information science or geospatial information studies to refer to the academic discipline or career of working with geographic information systems.[1] In the simplest terms, GIS is the merging of cartography, statistical analysis, and database technology. A GIS can be thought of as a system that digitally creates and "manipulates" spatial areas that may be jurisdictional, purpose-, or application-oriented. Generally, a GIS is custom-designed for an organization. Hence, a GIS developed for one application, jurisdiction, enterprise, or purpose may not necessarily be interoperable or compatible with a GIS that has been developed for some other application, jurisdiction, enterprise, or purpose. What goes beyond a GIS is a spatial data infrastructure, a concept that has no such restrictive boundaries.

In a general sense, the term describes any information system that integrates, stores, edits, analyzes, shares, and displays geographic information for informing decision making. GIS applications are tools that allow users to create interactive queries (user-created searches), analyze spatial information, edit data in maps, and present the results of all these operations.[2] Geographic information science is the science underlying geographic concepts, applications, and systems.

GIS techniques and technology
Modern GIS technologies use digital information, for which various digitized data creation methods are used. The most common method of data creation is digitization, where a hard copy map or survey plan is transferred into a digital medium through the use of a CAD program and geo-referencing capabilities. With the wide availability of ortho-rectified imagery (both from satellite and aerial sources), heads-up digitizing is becoming the main avenue through which geographic data is extracted. Heads-up digitizing involves the tracing of geographic data directly on top of the aerial imagery instead of by the traditional method of tracing the geographic form on a separate digitizing tablet (heads-down digitizing).

Data representation
GIS data represents real objects (such as roads, land use, elevation, trees, waterways, etc.) with digital data determining the mix. Real objects can be divided into two abstractions: discrete objects (e.g., a house) and continuous fields (such as rainfall amount or elevation). Traditionally, there are two broad methods used to store data in a GIS for both kinds of abstractions: raster images and vector data. Points, lines, and polygons are the stuff of mapped location attribute references. A newer hybrid method of storing data is that of identifying point clouds, which combine three-dimensional points with RGB information at each point, returning a "3D color image". GIS thematic maps are thus becoming more and more realistically visually descriptive of what they set out to show or determine.
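To make the raster/vector distinction concrete, here is a small illustrative sketch (coordinates and values are invented): a vector layer stores discrete objects as coordinate geometry plus attributes, while a raster layer stores a continuous field as a georeferenced grid of cell values.

# Illustrative sketch of the two classic GIS storage methods.

# Vector: discrete objects as coordinate geometry plus attributes.
house = {"type": "Point", "coordinates": (77.21, 28.61), "attributes": {"use": "residential"}}
road = {"type": "LineString",
        "coordinates": [(77.20, 28.60), (77.21, 28.61), (77.23, 28.62)],
        "attributes": {"name": "sample road"}}

# Raster: a continuous field (e.g. rainfall in mm) as rows of cell values,
# with an origin and cell size that georeference the grid.
rainfall = {
    "origin": (77.20, 28.60),   # lower-left corner (lon, lat)
    "cell_size": 0.01,          # degrees per cell
    "values": [
        [12.0, 13.5, 15.1],
        [11.2, 12.8, 14.0],
        [10.5, 11.9, 13.2],
    ],
}

print(road["coordinates"][0])      # first vertex of the road
print(rainfall["values"][1][2])    # rainfall value in one grid cell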

Data capture
[Figure: Example of hardware for mapping (GPS and laser rangefinder) and data collection (rugged computer).]
The current trend for geographic information systems (GIS) is that accurate mapping and data analysis are completed while in the field. The depicted hardware (field-map technology) is used mainly for forest inventories, monitoring and mapping. Data capture, entering information into the system, consumes much of the time of GIS practitioners. There are a variety of methods used to enter data into a GIS, where it is stored in a digital format.

Existing data printed on paper or PET film maps can be digitized or scanned to produce digital data. A digitizer produces vector data as an operator traces points, lines, and polygon boundaries from a map. Scanning a map results in raster data that can be further processed to produce vector data. Survey data can be directly entered into a GIS from digital data collection systems on survey instruments using a technique called coordinate geometry (COGO). Positions from a global navigation satellite system (GNSS) like the Global Positioning System can also be collected and then imported into a GIS.

A current trend in data collection gives users the ability to utilize field computers that can edit live data using wireless connections or disconnected editing sessions. This has been enhanced by the availability of low-cost mapping-grade GPS units with decimeter accuracy in real time, which eliminates the need to post-process, import, and update the data in the office after fieldwork has been collected. This includes the ability to incorporate positions collected using a laser rangefinder. New technologies also allow users to create maps as well as analysis directly in the field, making projects more efficient and mapping more accurate.

Remotely sensed data also plays an important role in data collection and consists of sensors attached to a platform. Sensors include cameras, digital scanners and LIDAR, while platforms usually consist of aircraft and satellites. Recently, with the development of miniature UAVs, aerial data collection is becoming possible at much lower cost and on a more frequent basis. For example, the Aeryon Scout was used to map a 50-acre area with a ground sample distance of 1 inch (2.54 cm) in only 12 minutes.[17]

The majority of digital data currently comes from photo interpretation of aerial photographs. Soft-copy workstations are used to digitize features directly from stereo pairs of digital photographs. These systems allow data to be captured in two and three dimensions, with elevations measured directly from a stereo pair using principles of photogrammetry. Analog aerial photos must be scanned before being entered into a soft-copy system; for high-quality digital cameras this step is skipped.

Satellite remote sensing provides another important source of spatial data. Here satellites use different sensor packages to passively measure the reflectance from parts of the electromagnetic spectrum or radio waves that were sent out from an active sensor such as radar. Remote sensing collects raster data that can be further processed using different bands to identify objects and classes of interest, such as land cover.

When data is captured, the user should consider whether the data should be captured with relative accuracy or absolute accuracy, since this can influence not only how information will be interpreted but also the cost of data capture. After entering data into a GIS, the data usually requires editing to remove errors, or further processing. Vector data must be made "topologically correct" before it can be used for some advanced analysis. For example, in a road network, lines must connect with nodes at an intersection. Errors such as undershoots and overshoots must also be removed. For scanned maps, blemishes on the source map may need to be removed from the resulting raster; for example, a fleck of dirt might connect two lines that should not be connected.

What Is a DBMS?
A DBMS manages a very large, integrated collection of data that models a real-world enterprise:
- Entities (e.g., students, courses)
- Relationships (e.g., Madonna is taking CS564)
A Database Management System (DBMS) is a software package designed to store and manage databases.

Why Use a DBMS?
- Data independence and efficient access
- Reduced application development time
- Data integrity and security
- Uniform data administration
- Concurrent access, recovery from crashes

Why Study Databases?
- Shift from computation to information: at the low end, the scramble to web space (a mess!); at the high end, scientific applications
- Datasets increasing in diversity and volume: digital libraries, interactive video, the Human Genome project, the EOS project, etc. The need for DBMSs is exploding.
- DBMS technology encompasses much of CS: OS, languages, theory, AI, multimedia, logic

Data Models
- A data model is a collection of concepts for describing data.
- A schema is a description of a particular collection of data, using a given data model.
- The relational model of data is the most widely used model today. Its main concept is the relation, basically a table with rows and columns. Every relation has a schema, which describes the columns, or fields.

Levels of Abstraction
- Many views, a single conceptual (logical) schema, and a physical schema.
- Views describe how users see the data.
- The conceptual schema defines the logical structure.
- The physical schema describes the files and indexes used.
- Schemas are defined using DDL; data is modified and queried using DML.

Example: University Database
Conceptual schema:
- Students(sid: string, name: string, login: string, age: integer, gpa: real)
- Courses(cid: string, cname: string, credits: integer)
- Enrolled(sid: string, cid: string, grade: string)
Physical schema:
- Relations stored as unordered files.
- Index on the first column of Students.
External schema (view):
- Course_info(cid: string, enrollment: integer)

Data Independence
Applications are insulated from how data is structured and stored.
- Logical data independence: protection from changes in the logical structure of data.
- Physical data independence: protection from changes in the physical structure of data.

Structure of a DBMS
A typical DBMS has a layered architecture. The figure does not show the concurrency control and recovery components. This is one of several possible architectures; each system has its own variations.
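A minimal sketch of the university example, using SQLite through Python's standard library: the DDL statements define the conceptual schema, and the Course_info view plays the role of the external schema. The sample rows are invented for illustration.

# Illustrative sketch: the university schema as DDL plus a view, via sqlite3.

import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Students (sid TEXT PRIMARY KEY, name TEXT, login TEXT,
                           age INTEGER, gpa REAL);
    CREATE TABLE Courses  (cid TEXT PRIMARY KEY, cname TEXT, credits INTEGER);
    CREATE TABLE Enrolled (sid TEXT REFERENCES Students, cid TEXT REFERENCES Courses,
                           grade TEXT);
    CREATE VIEW Course_info AS
        SELECT cid, COUNT(*) AS enrollment FROM Enrolled GROUP BY cid;
""")

# DML: insert a few invented rows and query through the external-schema view.
conn.execute("INSERT INTO Students VALUES ('53666', 'Jones', 'jones@cs', 18, 3.4)")
conn.execute("INSERT INTO Courses  VALUES ('CS564', 'Database Systems', 4)")
conn.execute("INSERT INTO Enrolled VALUES ('53666', 'CS564', 'A')")
print(conn.execute("SELECT * FROM Course_info").fetchall())   # [('CS564', 1)]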

An Introduction to Data Mining

Discovering hidden value in your data warehouse

Overview


Data mining, the extraction of hidden predictive information from large databases, is a powerful new technology with great potential to help companies focus on the most important information in their data warehouses. Data mining tools predict future trends and behaviors, allowing businesses to make proactive, knowledge-driven decisions. The automated, prospective analyses offered by data mining move beyond the analyses of past events provided by retrospective tools typical of decision support systems. Data mining tools can answer business questions that traditionally were too time consuming to resolve. They scour databases for hidden patterns, finding predictive information that experts may miss because it lies outside their expectations. Most companies already collect and refine massive quantities of data. Data mining techniques can be implemented rapidly on existing software and hardware platforms to enhance the value of existing information resources, and can be integrated with new products and systems as they are brought on-line. When implemented on high performance client/server or parallel processing computers, data mining tools can analyze massive databases to deliver answers to questions such as, "Which clients are most likely to respond to my next promotional mailing, and why?"

The Foundations of Data Mining


Data mining is ready for application in the business community because it is supported by three technologies that are now sufficiently mature:
- Massive data collection
- Powerful multiprocessor computers
- Data mining algorithms

The Scope of Data Mining


Data mining derives its name from the similarities between searching for valuable business information in a large database (for example, finding linked products in gigabytes of store scanner data) and mining a mountain for a vein of valuable ore. Both processes require either sifting through an immense amount of material, or intelligently probing it to find exactly where the value resides. Given databases of sufficient size and quality, data mining technology can generate new business opportunities by providing these capabilities:

Automated prediction of trends and behaviors. Data mining automates the process of finding predictive information in large databases. Questions that traditionally required extensive hands-on analysis can now be answered directly from the data quickly. A typical example of a predictive problem is targeted marketing: data mining uses data on past promotional mailings to identify the targets most likely to maximize return on investment in future mailings. Other predictive problems include forecasting bankruptcy and other forms of default, and identifying segments of a population likely to respond similarly to given events.

Automated discovery of previously unknown patterns. Data mining tools sweep through databases and identify previously hidden patterns in one step. An example of pattern discovery is the analysis of retail sales data to identify seemingly unrelated products that are often purchased together. Other pattern discovery problems include detecting fraudulent credit card transactions and identifying anomalous data that could represent data entry keying errors.

Databases can be larger in both depth and breadth:
- More columns. Analysts must often limit the number of variables they examine when doing hands-on analysis due to time constraints. Yet variables that are discarded because they seem unimportant may carry information about unknown patterns. High performance data mining allows users to explore the full depth of a database without preselecting a subset of variables.
- More rows. Larger samples yield lower estimation errors and variance, and allow users to make inferences about small but important segments of a population.

The most commonly used techniques in data mining are:
- Artificial neural networks: non-linear predictive models that learn through training and resemble biological neural networks in structure.
- Decision trees: tree-shaped structures that represent sets of decisions. These decisions generate rules for the classification of a dataset. Specific decision tree methods include Classification and Regression Trees (CART) and Chi-Square Automatic Interaction Detection (CHAID).
- Genetic algorithms: optimization techniques that use processes such as genetic combination, mutation, and natural selection in a design based on the concepts of evolution.
- Nearest neighbor method: a technique that classifies each record in a dataset based on a combination of the classes of the k records most similar to it in a historical dataset (where k >= 1). Sometimes called the k-nearest neighbor technique; a small sketch follows this list.
- Rule induction: the extraction of useful if-then rules from data based on statistical significance.

Many of these technologies have been in use for more than a decade in specialized analysis tools that work with relatively small volumes of data. These capabilities are now evolving to integrate directly with industry-standard data warehouse and OLAP platforms. The appendix to this white paper provides a glossary of data mining terms.
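As a concrete illustration of the nearest neighbor method listed above, the sketch below classifies a new record by the majority class of its k most similar historical records. The dataset, features, and classes are invented; real tools add indexing, scaling, and validation.

# Illustrative sketch of k-nearest-neighbour classification.

from collections import Counter
import math

# Historical dataset: (features, class), e.g. (age, income in $k) -> responded?
history = [
    ((25, 30), "no"), ((30, 45), "no"), ((45, 80), "yes"),
    ((50, 95), "yes"), ((35, 60), "yes"), ((22, 28), "no"),
]

def knn_classify(record, k=3):
    """Classify `record` by the majority class of its k nearest neighbours."""
    distances = sorted(
        (math.dist(record, features), label) for features, label in history
    )
    top_k = [label for _, label in distances[:k]]
    return Counter(top_k).most_common(1)[0][0]

print(knn_classify((40, 70)))   # likely 'yes'
print(knn_classify((24, 31)))   # likely 'no'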

How Data Mining Works

The technique used to perform these feats in data mining is called modeling. Modeling is simply the act of building a model in one situation where you know the answer and then applying it to another situation where you don't.

An Architecture for Data Mining


To best apply these advanced techniques, they must be fully integrated with a data warehouse as well as flexible interactive business analysis tools. Many data mining tools currently operate outside of the warehouse, requiring extra steps for extracting, importing, and analyzing the data. Furthermore, when new insights require operational implementation, integration with the warehouse simplifies the application of results from data mining. The resulting analytic data warehouse can be applied to improve business processes throughout the organization, in areas such as promotional campaign management, fraud detection, new product rollout, and so on. Figure 1 illustrates an architecture for advanced analysis in a large data warehouse.

Figure 1 - Integrated Data Mining Architecture

The ideal starting point is a data warehouse containing a combination of internal data tracking all customer contact coupled with external market data about competitor activity. Background information on potential customers also provides an excellent basis for prospecting. This warehouse can be implemented in a variety of relational database systems: Sybase, Oracle, Redbrick, and so on, and should be optimized for flexible and fast data access. An OLAP (On-Line Analytical Processing) server enables a more sophisticated end-user business model to be applied when navigating the data warehouse. The multidimensional structures allow the user to analyze the data as they want to view their business, summarizing by product line, region, and other key perspectives of the business. The Data Mining Server must be integrated with the data warehouse and the OLAP server to embed ROI-focused business analysis directly into this infrastructure. An advanced, process-centric metadata template defines the data mining objectives for specific business issues like campaign management, prospecting, and promotion optimization. Integration with the data warehouse enables operational decisions to be directly implemented and tracked. As the warehouse grows

with new decisions and results, the organization can continually mine the best practices and apply them to future decisions. This design represents a fundamental shift from conventional decision support systems. Rather than simply delivering data to the end user through query and reporting software, the Advanced Analysis Server applies users' business models directly to the warehouse and returns a proactive analysis of the most relevant information. These results enhance the metadata in the OLAP server by providing a dynamic metadata layer that represents a distilled view of the data. Reporting, visualization, and other analysis tools can then be applied to plan future actions and confirm the impact of those plans.

Profitable Applications
A wide range of companies have deployed successful applications of data mining. While early adopters of this technology have tended to be in information-intensive industries such as financial services and direct mail marketing, the technology is applicable to any company looking to leverage a large data warehouse to better manage their customer relationships. Two critical factors for success with data mining are: a large, well-integrated data warehouse and a well-defined understanding of the business process within which data mining is to be applied (such as customer prospecting, retention, campaign management, and so on). Some successful application areas include:

A pharmaceutical company can analyze its recent sales force activity and their results to improve targeting of high-value physicians and determine which marketing activities will have the greatest impact in the next few months. The data needs to include competitor market activity as well as information about the local health care systems. The results can be distributed to the sales force via a wide-area network that enables the representatives to review the recommendations from the perspective of the key attributes in the decision process. The ongoing, dynamic analysis of the data warehouse allows best practices from throughout the organization to be applied in specific sales situations.

A credit card company can leverage its vast warehouse of customer transaction data to identify customers most likely to be interested in a new credit product. Using a small test mailing, the attributes of customers with an affinity for the product can be identified. Recent projects have indicated more than a 20-fold decrease in costs for targeted mailing campaigns over conventional approaches.

A diversified transportation company with a large direct sales force can apply data mining to identify the best prospects for its services. Using data mining to analyze its own customer experience, this company can build a unique segmentation identifying the attributes of high-value prospects. Applying this segmentation to a general business database such as those provided by Dun & Bradstreet can yield a prioritized list of prospects by region.

A large consumer package goods company can apply data mining to improve its sales process to retailers. Data from consumer panels, shipments, and competitor activity can be applied to understand the reasons for brand and store switching. Through this analysis,

the manufacturer can select promotional strategies that best reach their target customer segments. Each of these examples has a clear common ground: they leverage the knowledge about customers implicit in a data warehouse to reduce costs and improve the value of customer relationships. These organizations can now focus their efforts on the most important (profitable) customers and prospects, and design targeted marketing strategies to best reach them.

Data Warehousing Systems


A data warehousing system can perform advanced analyses of operational data without impacting operational systems. OLTP systems are very fast and efficient at recording business transactions, but not so good at providing answers to high-level strategic questions.

Component Systems Legacy Systems


Any information system currently in use that was built using previous technology generations. Most legacy systems are operational in nature, largely because the automation of transaction-oriented business processes had long been the priority of IT projects.

Source Systems
Any system from which data is taken for a data warehouse. A source system is often called a legacy system in a mainframe environment.

Operational Data Stores (ODS)
An ODS is a collection of integrated databases designed to support the monitoring of operations. Unlike the databases of OLTP applications (which are function oriented), the ODS contains subject-oriented, volatile, and current enterprise-wide detailed information. It serves as a system of record that provides comprehensive views of data in operational sources. Like data warehouses, ODSs are integrated and subject-oriented. However, an ODS is always current and is constantly updated. The ODS is an ideal data source for a data warehouse, since it already contains integrated operational data as of a given point in time. In short, an ODS is an integrated collection of clean data destined for the data warehouse.

Definition
Data warehouses are mostly populated with periodic migrations of data from operational systems. The second source is made up of external, frequently purchased, databases; examples of this data include lists of income and demographic information. This purchased information is linked with internal data about customers to develop a good customer profile. A data warehouse is a
- subject-oriented
- integrated
- time-variant
- non-volatile
collection of data in support of management decisions.

Subject Oriented
OLTP databases usually hold information about small subsets of the organization. For example, a retailer might have separate order entry systems and databases for retail, catalog, and outlet sales. Each system will support queries about the information it captures. But if somebody wants to find details of all sales, these separate systems are not adequate. To address this type of situation, the data warehouse database should be subject-oriented, organized into subject areas like sales, rather than around OLTP data sources. A data warehouse is organized around major subjects such as customer, products, sales, etc. Data are organized according to subject instead of application. For example, an insurance company using a data warehouse would organize its data by customer, premium, and claim instead of by different products (auto, life, property, etc.).

Integrated
A data warehouse is usually constructed by integrating multiple, heterogeneous sources, such as relational databases, flat files, and OLTP files. When data resides in many separate applications in the operational environment, the encoding of data is often inconsistent. For example, in the above system, the retail system uses a 7-digit numeric code for products, the outlet system's code consists of 9 alphanumeric characters, and the catalog system uses 4 letters followed by 4 digits. To create a useful subject area, the source data must be integrated. There is no need to change the coding in those systems, but there must be some mechanism to transform the data coming into the data warehouse and assign a common coding scheme.
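As a small illustration of assigning a common coding scheme during loading (the product code formats follow the example above, but the mapping itself is invented), a transform step might look like this:

# Illustrative sketch: normalizing inconsistent source product codes.

def to_warehouse_code(source_system: str, product_code: str) -> str:
    """Map a source-specific product code into a single warehouse scheme."""
    if source_system == "retail":      # 7-digit numeric code
        return f"PRD-{int(product_code):07d}"
    if source_system == "outlet":      # 9-character alphanumeric code
        return f"PRD-{product_code.upper()}"
    if source_system == "catalog":     # 4 letters followed by 4 digits
        return f"PRD-{product_code[:4].upper()}-{product_code[4:]}"
    raise ValueError(f"unknown source system: {source_system}")

print(to_warehouse_code("retail", "1234567"))    # PRD-1234567
print(to_warehouse_code("catalog", "shrt0042"))  # PRD-SHRT-0042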

Nonvolatile
Unlike operational databases, warehouses primarily support reporting, not data capture. A data warehouse is always a physically separate store of data. Due to this separation, data warehouses do not require transaction processing, recovery, concurrency control etc. The data are not updated or changed in any way once they enter the data warehouse, but are only loaded, refreshed and accessed for queries.

Time Variant
Data are stored in a data warehouse to provide a historical perspective. Every key structure in the data warehouse contains, implicitly or explicitly, an element of time. A data warehouse generally stores data that is 5-10 years old, to be used for comparisons, trends, and forecasting.

Operational Systems vs Data Warehousing Systems

Operational                                        | Data Warehouse
Holds current data                                 | Holds historic data
Data is dynamic                                    | Data is largely static
Read/write accesses                                | Read-only accesses
Repetitive processing                              | Ad hoc complex queries
Transaction driven                                 | Analysis driven
Application oriented                               | Subject oriented
Used by clerical staff for day-to-day operations   | Used by top managers for analysis
Normalized data model (ER model)                   | Denormalized data model (dimensional model)
Must be optimized for writes and small queries     | Must be optimized for queries involving a large portion of the warehouse

Advantages of Data Warehousing
o Potential high return on investment
o Competitive advantage
o Increased productivity of corporate decision makers

Problems with Data Warehousing
o Underestimation of resources for data loading
o Hidden problems with source systems
o Required data not captured
o Increased end-user demands
o High maintenance
o Long duration projects
o Complexity of integration

Data Warehouse Architecture
A typical data warehousing architecture is illustrated below:

DATA WAREHOUSE COMPONENTS & ARCHITECTURE


The data in a data warehouse comes from the operational systems of the organization as well as from other external sources. These are collectively referred to as source systems. The data extracted from source systems is stored in an area called the data staging area, where the data is cleaned, transformed, combined, and de-duplicated to prepare it for use in the data warehouse. The data staging area is generally a collection of machines where simple activities like sorting and sequential processing take place. The data staging area does not provide any query or presentation services. As soon as a system provides query or presentation services, it is categorized as a presentation server. A presentation server is the target machine on which the data loaded from the data staging area is organized and stored for direct querying by end users, report writers, and other applications. The three different kinds of systems that are required for a data warehouse are:
1. Source systems
2. Data staging area
3. Presentation servers

The data travels from source systems to presentation servers via the data staging area. The entire process is popularly known as ETL (extract, transform, and load) or ETT (extract, transform, and transfer). Oracle's ETL tool is called Oracle Warehouse Builder (OWB) and MS SQL Server's ETL tool is called Data Transformation Services (DTS). A typical architecture of a data warehouse is shown below. Each component and the tasks performed by it are explained below.

OPERATIONAL DATA
The data for the data warehouse is supplied from:
o Mainframe systems, in the traditional network and hierarchical formats.
o Relational DBMSs such as Oracle and Informix.
o In addition to these internal data, operational data also includes external data obtained from commercial databases and databases associated with suppliers and customers.

LOAD MANAGER The load manager performs all the operations associated with extraction and loading data into the data warehouse. These operations include simple transformations of the data to prepare the data for entry into the warehouse. The size and complexity of this component will vary between data warehouses and may be constructed using a combination of vendor data loading tools and custom built programs. WAREHOUSE MANAGER The warehouse manager performs all the operations associated with the management of data in the warehouse. This component is built using vendor data management tools and custom built programs. The operations performed by warehouse manager include:
o Analysis of data to ensure consistency
o Transformation and merging of the source data from temporary storage into data warehouse tables
o Creation of indexes and views on the base tables
o Denormalization
o Generation of aggregations
o Backing up and archiving of data

In certain situations, the warehouse manager also generates query profiles to determine which indexes and aggregations are appropriate.

QUERY MANAGER

The query manager performs all operations associated with the management of user queries. This component is usually constructed using vendor end-user access tools, data warehousing monitoring tools, database facilities, and custom-built programs. The complexity of a query manager is determined by the facilities provided by the end-user access tools and the database.

DETAILED DATA
This area of the warehouse stores all the detailed data in the database schema. In most cases detailed data is not stored online but is aggregated to the next level of detail. However, the detailed data is added regularly to the warehouse to supplement the aggregated data.

LIGHTLY AND HIGHLY SUMMARIZED DATA
This area of the data warehouse stores all the predefined lightly and highly summarized (aggregated) data generated by the warehouse manager. This area of the warehouse is transient as it will be subject to change on an ongoing basis in order to respond to changing query profiles. The purpose of the summarized information is to speed up query performance. The summarized data is updated continuously as new data is loaded into the warehouse.

ARCHIVE AND BACKUP DATA
This area of the warehouse stores detailed and summarized data for the purpose of archiving and backup. The data is transferred to storage archives such as magnetic tapes or optical disks.

META DATA
The data warehouse also stores all the meta data (data about data) definitions used by all processes in the warehouse. It is used for a variety of purposes, including:
o The extraction and loading process: meta data is used to map data sources to a common view of information within the warehouse.
o The warehouse management process: meta data is used to automate the production of summary tables.
o The query management process: meta data is used to direct a query to the most appropriate data source.

The structure of meta data will differ in each process, because the purpose is different. More about meta data will be discussed in the later lecture notes.

END-USER ACCESS TOOLS
The principal purpose of a data warehouse is to provide information to business managers for strategic decision-making. These users interact with the warehouse using end-user access tools. Examples of end-user access tools include:

o Reporting and Query Tools
o Application Development Tools
o Executive Information Systems Tools
o Online Analytical Processing Tools
o Data Mining Tools

THE ETL (EXTRACT, TRANSFORM, LOAD) PROCESS
In this section we discuss the four major processes of the data warehouse. They are extract (take data from the operational systems and bring it to the data warehouse), transform (convert the data into the internal format and structure of the data warehouse), cleanse (make sure it is of sufficient quality to be used for decision making) and load (put the cleansed data into the data warehouse). The four processes from extraction through loading are often referred to collectively as data staging.
EXTRACT

Some of the data elements in the operational database can reasonably be expected to be useful in decision making, but others are of less value for that purpose. For this reason, it is necessary to extract the relevant data from the operational database before bringing it into the data warehouse. Many commercial tools are available to help with the extraction process. Data Junction is one such commercial product. The user of one of these tools typically has an easy-to-use windowed interface by which to specify the following (a minimal sketch of such a specification follows the list):
o Which files and tables are to be accessed in the source database?
o Which fields are to be extracted from them? This is often done internally by an SQL SELECT statement.
o What are those fields to be called in the resulting database?
o What is the target machine and database format of the output?
o On what schedule should the extraction process be repeated?
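The following Python sketch expresses such an extraction specification as a plain dictionary; the table, field, target, and schedule names are invented for illustration and are not taken from Data Junction or any real tool.

# Hypothetical extraction specification: which source tables and fields to
# pull, what they will be called in the warehouse, where to load them, and
# how often to repeat the extraction.
extraction_spec = {
    "source": {"table": "ORDERS", "fields": ["ORD_NO", "CUST_ID", "ORD_AMT"]},
    "rename": {"ORD_NO": "order_id", "CUST_ID": "customer_id", "ORD_AMT": "amount"},
    "target": {"machine": "dw-server", "format": "oracle", "table": "stg_orders"},
    "schedule": "nightly",
}

def build_select(spec: dict) -> str:
    """Generate the SQL SELECT such a tool would issue against the source."""
    fields = ", ".join(spec["source"]["fields"])
    return f"SELECT {fields} FROM {spec['source']['table']}"

print(build_select(extraction_spec))  # SELECT ORD_NO, CUST_ID, ORD_AMT FROM ORDERS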

TRANSFORM
The operational databases may have been developed based on any set of priorities, which keep changing with the requirements. Therefore those who develop a data warehouse based on these databases are typically faced with inconsistency among their data sources. The transformation process deals with rectifying any such inconsistency. One of the most common transformation issues is attribute naming inconsistency. It is common for a given data element to be referred to by different data names in different databases: Employee Name may be EMP_NAME in one database and ENAME in another. Thus one set of data names is picked and used consistently in the data warehouse. Once all the data elements have the right names, they must be converted to common formats. The conversion may encompass the following:

o Characters must be converted from ASCII to EBCDIC or vice versa.
o Mixed text may be converted to all uppercase for consistency.
o Numerical data must be converted into a common format.
o Data formats have to be standardized.
o Measurements may have to be converted (Rs / $).
o Coded data (Male/Female, M/F) must be converted into a common format.

All these transformation activities are automated and many commercial products are available to perform the tasks. DataMAPPER from Applied Database Technologies is one such comprehensive tool.
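As a concrete illustration of the naming and format conversions listed above, here is a minimal Python sketch; the field names, the Rs-to-$ rate, and the code values are assumptions made only for illustration.

# Hypothetical sketch of common transformations: standardize attribute names,
# letter case, coded values, and units before loading into the warehouse.

NAME_MAP = {"EMP_NAME": "employee_name", "ENAME": "employee_name"}   # naming
GENDER_MAP = {"M": "MALE", "F": "FEMALE", "MALE": "MALE", "FEMALE": "FEMALE"}
RS_TO_USD = 0.012  # assumed exchange rate, for illustration only

def transform(record: dict) -> dict:
    out = {}
    for field, value in record.items():
        field = NAME_MAP.get(field, field.lower())     # one consistent data name
        if isinstance(value, str):
            value = value.upper()                      # mixed text -> uppercase
        out[field] = value
    if "gender" in out:
        out["gender"] = GENDER_MAP[out["gender"]]      # common coded format
    if "salary_rs" in out:
        out["salary_usd"] = round(out.pop("salary_rs") * RS_TO_USD, 2)  # Rs -> $
    return out

print(transform({"EMP_NAME": "Asha Rao", "GENDER": "f", "SALARY_RS": 950000}))
# {'employee_name': 'ASHA RAO', 'gender': 'FEMALE', 'salary_usd': 11400.0}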
CLEANSING

Information quality is the key consideration in determining the value of the information. The developer of the data warehouse is not usually in a position to change the quality of its underlying historic data, though a data warehousing project can put a spotlight on data quality issues and lead to improvements for the future. It is, therefore, usually necessary to go through the data entered into the data warehouse and make it as error-free as possible. This process is known as data cleansing. Data cleansing must deal with many types of possible errors. These include missing data and incorrect data at one source, and inconsistent data and conflicting data when two or more sources are involved. There are several algorithms followed to clean the data, which will be discussed in the coming lecture notes.
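To make the idea of data cleansing concrete, here is a minimal Python sketch that flags missing values and resolves a conflict between two sources by a simple "most recent wins" rule; the field names and the resolution rule are illustrative assumptions, not one of the algorithms covered in the later notes.

# Hypothetical cleansing sketch: detect missing data and reconcile two
# sources that disagree, keeping the more recently updated values.
from datetime import date

REQUIRED = ("customer_id", "city", "phone")

def cleanse(record: dict) -> tuple:
    """Return the record plus a list of quality problems found."""
    problems = [f"missing {f}" for f in REQUIRED if not record.get(f)]
    return record, problems

def reconcile(a: dict, b: dict) -> dict:
    """Conflicting sources: keep field values from the fresher record."""
    newer, older = (a, b) if a["updated"] >= b["updated"] else (b, a)
    merged = dict(older)
    merged.update({k: v for k, v in newer.items() if v})  # non-empty wins
    return merged

crm = {"customer_id": 7, "city": "PUNE", "phone": "", "updated": date(2009, 3, 1)}
billing = {"customer_id": 7, "city": "MUMBAI", "phone": "98200", "updated": date(2009, 6, 5)}

print(cleanse(reconcile(crm, billing)))   # billing's newer city and phone are kept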
LOADING

Loading often implies physical movement of the data from the computer(s) storing the source database(s) to that which will store the data warehouse database, assuming it is different. This takes place immediately after the extraction phase. The most common channel for data movement is a high-speed communication link. For example, Oracle Warehouse Builder is the tool from Oracle which provides the features to perform the ETL tasks on an Oracle data warehouse.

Executive Support Systems (ESS)

Executive Support Systems supply the necessary tools to senior management. The decisions at this level of the company are usually never structured and could be described as "educated guesses." Executives rely as much, if not more, on external data than they do on data internal to their organization. Decisions must be made in the context of the world outside the organization. The problems and situations senior executives face are very fluid, always changing, so the system must be flexible and easy to manipulate.

The Role of ESS in the Organization
Executives often face information overload and must be able to separate the chaff from the wheat in order to make the right decision. On the other hand, if the information they have is not detailed enough they may not be able to make the best decision. An ESS can supply the summarized information executives need and yet provide the opportunity to drill down to more detail if necessary. As technology advances, ESS are able to link data from various sources, both internal and external, to provide the amount and kind of information executives find useful. As common software programs include more options and executives gain experience using these programs, they are turning to them as an easy way to manipulate information. Many executives are also turning to the Web to provide the flexibility they need.

Benefits of ESS
As more executives come up through the ranks, they are more familiar with and rely more on technology to assist them with their jobs. Executive Support Systems don't provide executives with ready-made decisions. They provide the information that helps them make their decisions. Executives use that information, along with their experience, knowledge, education, and understanding of the corporation and the business environment as a whole, to make their decisions. Executives are more inclined to want summarized data rather than detailed data (even though the details must be available). ESS rely on graphic presentation of information because it's a much quicker way for busy executives to grasp summarized information.

Advantages
o Simple for high-level executives to use; operations do not require extensive computer experience
o Provides timely delivery of company summary information
o Provides better understanding of information
o Filters data for better time management
o Provides a system for improved information tracking

Disadvantages
o Computer skills required to obtain results
o Requires preparation and analysis time to get desired information
o Detail oriented; provides detailed analysis of a situation
o Difficult to quantify benefits of a DSS (how do you quantify a better decision?)
o Difficult to maintain database integrity
o Provides only moderate support of external data and graphics capabilities

Characteristics of ESS
Degree of use: High, consistent, without need of technical assistance
Computer skills required: Very low - must be easy to learn and use
Flexibility: High - must fit executive decision-making style
Principal use: Tracking, control
Decisions supported: Upper-level management, unstructured
Data supported: Company internal and external
Output capabilities: Text, tabular, graphical, trend toward audio/video in future
Graphic concentration: High, presentation style
Data access speed: Must be high, fast response

Executive Information Systems

An Executive Information System (EIS) is a type of management information system intended to facilitate and support the information and decision-making needs of senior executives by providing easy access to both internal and external information relevant to meeting the strategic goals of the organization. It is commonly considered a specialized form of a Decision Support System (DSS) and is otherwise referred to as an Executive Support System (ESS).

EIS are targeted at management's need to quickly assess the status of a business or a section of the business. These packages are aimed firmly at the type of business user who needs instant and up-to-date understanding of critical business information to aid decision making. The idea behind an EIS is that information can be collated and displayed to the user without manipulation or further processing. The user can then quickly see the status of his chosen department or function, enabling them to concentrate on decision making. Generally an EIS is configured to display data such as order backlogs, open sales, purchase order backlogs, shipments, receipts and pending orders. This information can then be used to make executive decisions at a strategic level.

The emphasis of the system as a whole is the easy-to-use interface and the integration with a variety of data sources. It offers strong reporting and data mining capabilities which can provide all the data the executive is likely to need. Traditionally the interface was menu driven with either reports or text presentation. Newer systems, and especially the newer Business Intelligence systems, which are replacing EIS, have a dashboard or scorecard type display.


Types of Executive Information System

Corporate Management - responsible for business and fiscal planning, budgetary control, as well as for ensuring the corporate information technology needs are met in a co-ordinated and cost-effective manner. E.g., management functions, human resources, financial data, correspondence, performance measures, etc. (whatever is interesting to executives).

Technical Information Dissemination - for the purpose of disseminating the latest information on relevant technologies, products, processes and markets. E.g., energy, environment, aerospace, weather, etc.

Executive Information System Components
The components of an EIS can typically be classified as hardware, software, interface, and telecommunication.

Hardware

Software
1. Text base software. The most common form of text is probably documents.
2. Database. Heterogeneous databases residing on a range of vendor-specific and open computer platforms help executives access both internal and external data.
3. Graphic base. Graphics can turn volumes of text and statistics into visual information for executives. Typical graphic types are: time series charts, scatter diagrams, maps, motion graphics, sequence charts, and comparison-oriented graphs (i.e., bar charts).
4. Model base. The executive information system models contain routine and special statistical, financial, and other quantitative analysis.

Interface
An EIS needs to be efficient at retrieving relevant data for decision makers, so the interface is very important. Several types of interfaces can be available to the EIS structure, such as scheduled reports, questions & answers, menu driven, command language, natural language, and input & output.

Telecommunication

Executive Support Systems Characteristics
Executive support systems are end-user computerised information systems operated directly by executive managers. They utilise newer computer technology in the form of data sources, hardware and programs, to place data in a common format, and provide fast and easy access to information. They integrate data from a variety of sources both internal and external to the organisation. They focus on helping executives assimilate information quickly to identify problems and opportunities. In other words, EISs help executives track their critical success factors. Each system is tailored to the needs and preferences of an individual user, and information is presented in a format which can most readily be interpreted.

Capabilities of Executive Support Systems
Most executive support systems offer the following capabilities (a small sketch after this section illustrates roll-up and drill-down):
o Consolidation - involves the aggregation of information and features simple roll-ups to complex groupings of interrelated information
o Drill-down - enables users to get details, and details of details, of information
o Slice-and-dice - looks at information from different perspectives
o Digital dashboard - integrates information from multiple components and presents it in a unified display

Advantages of Executive Information System
o It provides timely delivery of company summary information.
o It provides better understanding of information.
o It filters data for management.
o It provides a system for improvement in information tracking.
o It offers efficiency to decision makers.

Disadvantages of Executive Information System
o Functions are limited; it cannot perform complex calculations.
o Hard to quantify benefits and to justify implementation of an EIS.
o Executives may encounter information overload.
o The system may become slow, large, and hard to manage.
o Difficult to keep data current.
o May lead to less reliable and insecure data.
o Small companies may encounter excessive costs for implementation.
o The requirement for highly skilled personnel cannot be fulfilled by small businesses.
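The consolidation and drill-down capabilities above can be illustrated with a short Python sketch that rolls sales up by region and then expands one region back into per-store detail; the figures and names are invented for illustration.

# Hypothetical sketch: roll-up (consolidation) and drill-down over sales data.
from collections import defaultdict

sales = [  # (region, store, amount) -- invented sample data
    ("North", "N-01", 120), ("North", "N-02", 80),
    ("South", "S-01", 200), ("South", "S-02", 50),
]

# Consolidation: roll the detail up to one figure per region.
by_region = defaultdict(int)
for region, _store, amount in sales:
    by_region[region] += amount
print(dict(by_region))            # {'North': 200, 'South': 250}

# Drill-down: expand one region back into its per-store details.
def drill_down(region: str) -> dict:
    return {store: amount for r, store, amount in sales if r == region}

print(drill_down("South"))        # {'S-01': 200, 'S-02': 50}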

Executive Information System Features
o They provide summary information to support monitoring of business performance. This is often achieved through measures known as critical success factors or key performance indicators (KPIs). These will be displayed in an easy-to-interpret form such as a graph showing their variation through time. If a KPI falls below a critical preset value, the system will notify the manager through a visible or audible warning.
o They are used mainly for strategic decision making, but may also provide features that relate to tactical decision making.
o They provide a drill-down feature which gives a manager the opportunity to find out more of the information necessary to take a decision or to discover the source of a problem. E.g. a manager in a multinational manufacturing company might find from the EIS that a particular country is underperforming in production. He could drill down to see which particular factory was responsible for this.
o They provide analysis tools.
o They must be integrated with other facilities that help manage the solving of problems and the daily running of the business. These include electronic mail and scheduling and calendar facilities.
o They integrate data from a wide range of information sources, including company sources and external sources such as market and competitor data.
o They have to be designed according to the needs of managers who do not use computers frequently. They should be intuitive and easy to learn.
All these facilities require integration with operational data. Since this information is commonly stored in ERP systems, these are often integrated with EIS or have EIS functions built in.

An expert system is an interactive computer-based decision tool that uses both facts and heuristics to solve difficult decision-making problems, based on knowledge acquired from an expert. An expert system is a model and associated procedure that exhibits, within a specific domain, a degree of expertise in problem solving that is comparable to that of a human expert. Compared with a traditional computer program:

Inference engine + Knowledge = Expert system
Algorithm + Data structures = Program (in a traditional computer)

The first expert system, called DENDRAL, was developed in the early 1970s at Stanford University.
Expert systems are computer applications which embody some non-algorithmic expertise for solving certain types of problems. For example, expert systems are used in diagnostic applications. They also play chess, make financial planning decisions, configure computers, monitor real-time systems, underwrite insurance policies, and perform many services which previously required human expertise.

Components and Interfaces

Knowledge base: A declarative representation of the expertise, often in IF-THEN rules.
Working storage: The data which is specific to the problem being solved.
Inference engine: The code at the core of the system which derives recommendations from the knowledge base and the problem-specific data in working storage.
User interface: The code that controls the dialog between the user and the system.

A small sketch after the roles below illustrates how these components fit together.

Roles of Individuals who Interact with the System

Domain expert: The individuals who currently are experts in solving the problems the system is intended to solve.
Knowledge engineer: The individual who encodes the expert's knowledge in a declarative form that can be used by the expert system.
User: The individual who will be consulting with the system to get advice which would otherwise have been provided by the expert.

Knowledge engineer: uses the shell to build a system for a particular problem domain.
System engineer: builds the user interface, designs the declarative format of the knowledge base, and implements the inference engine.
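As a concrete picture of how the knowledge base, working storage, and inference engine fit together, here is a minimal forward-chaining sketch in Python; the rules and facts are toy examples invented for illustration, not taken from any real expert system.

# Minimal forward-chaining sketch: IF-THEN rules (knowledge base), facts for
# one problem (working storage), and a loop that derives new facts (inference
# engine) until nothing more can be concluded.

knowledge_base = [  # IF all conditions hold THEN add the conclusion
    ({"fever", "rash"}, "suspect_measles"),
    ({"suspect_measles"}, "recommend_lab_test"),
]

def infer(facts: set) -> set:
    working_storage = set(facts)
    changed = True
    while changed:
        changed = False
        for conditions, conclusion in knowledge_base:
            if conditions <= working_storage and conclusion not in working_storage:
                working_storage.add(conclusion)   # fire the rule
                changed = True
    return working_storage

print(infer({"fever", "rash"}))
# {'fever', 'rash', 'suspect_measles', 'recommend_lab_test'} (order may vary)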
EXPERT SYSTEM CHARACTERISTICS
Operates as an interactive system
This means an expert system:
o Responds to questions
o Asks for clarifications
o Makes recommendations
o Aids the decision-making process

Tools have the ability to sift (filter) knowledge
o Storage and retrieval of knowledge
o Mechanisms to expand and update the knowledge base on a continuing basis

Makes logical inferences based on the knowledge stored
o Simple reasoning mechanisms are used
o The knowledge base must have a means of exploiting the knowledge stored, else it is useless; e.g., learning all the words in a language without knowing how to combine those words to form a meaningful sentence

Ability to explain reasoning
o Remembers the logical chain of reasoning; therefore a user may ask for an explanation of a recommendation and the factors considered in the recommendation
o This enhances user confidence in the recommendation and acceptance of the expert system

Domain-specific
o A particular system caters to a narrow area of specialization; e.g., a medical expert system cannot be used to diagnose faults in an electrical circuit
o The quality of advice offered by an expert system depends on the amount of knowledge stored

Capability to assign confidence values
o Can deliver quantitative information
o Can interpret qualitatively derived values
o Can address imprecise and incomplete data through the assignment of confidence values
Applications
o Best suited for problems that require expert heuristics for their solution.
o Not a suitable choice for problems that can be solved using purely numerical techniques.
Cost-Effective Alternative to a Human Expert
Expert systems have become increasingly popular because of their specialization, albeit in a narrow field. Encoding and storing the domain-specific knowledge is an economic process due to its small size. Specialists in many areas are rare and the cost of consulting them is high; an expert system for those areas can be a useful and cost-effective alternative in the long run.

ARTIFICIAL INTELLIGENCE
Artificial Intelligence is a branch of science which deals with helping machines find solutions to complex problems in a more human-like fashion. This generally involves borrowing characteristics from human intelligence and applying them as algorithms in a computer-friendly way. A more or less flexible or efficient approach can be taken depending on the requirements.

Technology...
There are many different approaches to Artificial Intelligence, none of which are either completely right or wrong. Some are obviously more suited than others in some cases, but any working alternative can be defended. Over the years, trends have emerged based on the state of mind of influential researchers, funding opportunities, and available computer hardware. Over the past five decades, AI research has mostly focused on solving specific problems. Numerous solutions have been devised and improved to do so efficiently and reliably. This explains why the field of Artificial Intelligence is split into many branches, ranging from Pattern Recognition to Artificial Life, including Evolutionary Computation and Planning.

Applications...
The potential applications of Artificial Intelligence are abundant. They stretch from the military, for autonomous control and target identification, to the entertainment industry, for computer games and robotic pets. Let's also not forget big establishments dealing with huge amounts of information, such as hospitals, banks and insurers, which can use AI to predict customer behaviour and detect trends. As you may expect, the business of Artificial Intelligence is becoming one of the major driving forces for research. With an ever-growing market to satisfy, there's plenty of room for more personnel. So if you know what you're doing, there's plenty of money to be made from interested big companies!

BRANCHES OF AI

logical AI
What a program knows about the world in general, the facts of the specific situation in which it must act, and its goals are all represented by sentences of some mathematical logical language. The program decides what to do by inferring that certain actions are appropriate for achieving its goals.

search
AI programs often examine large numbers of possibilities, e.g. moves in a chess game or inferences by a theorem-proving program. Discoveries are continually made about how to do this more efficiently in various domains.

pattern recognition
When a program makes observations of some kind, it is often programmed to compare what it sees with a pattern. For example, a vision program may try to match a pattern of eyes and a nose in a scene in order to find a face.

representation
Facts about the world have to be represented in some way. Usually languages of mathematical logic are used.

inference
From some facts, others can be inferred. Mathematical logical deduction is adequate for some purposes, but new methods of non-monotonic inference have been added to logic since the 1970s. The simplest kind of non-monotonic reasoning is default reasoning, in which a conclusion is to be inferred by default, but the conclusion can be withdrawn if there is evidence to the contrary. For example, when we hear of a bird, we may infer that it can fly, but this conclusion can be reversed when we hear that it is a penguin. It is the possibility that a conclusion may have to be withdrawn that constitutes the non-monotonic character of the reasoning. Ordinary logical reasoning is monotonic in that the set of conclusions that can be drawn from a set of premises is a monotonically increasing function of the premises. Circumscription is another form of non-monotonic reasoning. (A small sketch after this list of branches illustrates default reasoning.)

common sense knowledge and reasoning
This is the area in which AI is farthest from human level, in spite of the fact that it has been an active research area since the 1950s. While there has been considerable progress, e.g. in developing systems of non-monotonic reasoning and theories of action, yet more new ideas are needed. The Cyc system contains a large but spotty collection of common sense facts.

learning from experience
Programs do that. The approaches to AI based on connectionism and neural nets specialize in that. There is also learning of laws expressed in logic.

planning
Planning programs start with general facts about the world (especially facts about the effects of actions), facts about the particular situation and a statement of a goal. From these, they generate a strategy for achieving the goal. In the most common cases, the strategy is just a sequence of actions.

epistemology
This is a study of the kinds of knowledge that are required for solving problems in the world.

ontology
Ontology is the study of the kinds of things that exist. In AI, the programs and sentences deal with various kinds of objects, and we study what these kinds are and what their basic properties are. Emphasis on ontology began in the 1990s.

heuristics
A heuristic is a way of trying to discover something, or an idea embedded in a program. The term is used variously in AI. Heuristic functions are used in some approaches to search to measure how far a node in a search tree seems to be from a goal.

genetic programming
Genetic programming is a technique for getting programs to solve a task by mating random Lisp programs and selecting the fittest over millions of generations. It is being developed by John Koza's group.
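The bird/penguin example of default reasoning mentioned in the list above can be sketched in a few lines of Python; this is only an illustration of the withdraw-on-new-evidence idea, not a real non-monotonic logic system, and the list of exceptions is assumed.

# Toy sketch of default (non-monotonic) reasoning: conclude "can fly" by
# default for a bird, but withdraw that conclusion for known exceptions.

EXCEPTIONS = {"penguin", "ostrich"}  # assumed list of flightless birds

def can_fly(facts: set) -> bool:
    if "bird" not in facts:
        return False
    # Default conclusion: birds fly -- unless evidence to the contrary appears.
    return not (facts & EXCEPTIONS)

print(can_fly({"bird"}))             # True  (default conclusion)
print(can_fly({"bird", "penguin"}))  # False (conclusion withdrawn)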

APPLICATIONS OF AI
game playing
You can buy machines that can play master-level chess for a few hundred dollars. There is some AI in them, but they play well against people mainly through brute-force computation, looking at hundreds of thousands of positions. To beat a world champion by brute force and known reliable heuristics requires being able to look at 200 million positions per second.

speech recognition
In the 1990s, computer speech recognition reached a practical level for limited purposes. Thus United Airlines has replaced its keyboard tree for flight information by a system using speech recognition of flight numbers and city names. It is quite convenient. On the other hand, while it is possible to instruct some computers using speech, most users have gone back to the keyboard and the mouse as still more convenient.

understanding natural language
Just getting a sequence of words into a computer is not enough. Parsing sentences is not enough either. The computer has to be provided with an understanding of the domain the text is about, and this is presently possible only for very limited domains.

computer vision
The world is composed of three-dimensional objects, but the inputs to the human eye and computers' TV cameras are two-dimensional. Some useful programs can work solely in two dimensions, but full computer vision requires partial three-dimensional information that is not just a set of two-dimensional views. At present there are only limited ways of representing three-dimensional information directly, and they are not as good as what humans evidently use.

expert systems
A "knowledge engineer" interviews experts in a certain domain and tries to embody their knowledge in a computer program for carrying out some task. How well this works depends on whether the intellectual mechanisms required for the task are within the present state of AI. When this turned out not to be so, there were many disappointing results. One of the first expert systems was MYCIN in 1974, which diagnosed bacterial infections of the blood and suggested treatments.

heuristic classification
One of the most feasible kinds of expert system, given the present knowledge of AI, is to put some information into one of a fixed set of categories using several sources of information. An example is advising whether to accept a proposed credit card purchase. Information is available about the owner of the credit card, his record of payment, and also about the item he is buying and the establishment from which he is buying it (e.g., about whether there have been previous credit card frauds at this establishment).

Group Decision Support Systems (GDSS) are a class of electronic meeting systems, a collaboration technology designed to support meetings and group work [1]. GDSS are distinct from computer-supported cooperative work (CSCW) technologies, as GDSS are more focused on task support, whereas CSCW tools provide general communication support. Group Decision Support Systems are categorized within a time-place paradigm.

The benefits, or process gains, from using a GDSS (over more traditional group techniques) are:
o More precise communication
o Synergy: members are empowered to build on the ideas of others
o More objective evaluation of ideas
o Stimulation of individuals to increase participation
o Learning: group members imitate and learn from successful behaviors of others

The costs, or process losses, from using a GDSS (instead of more traditional group techniques) are:
o More free riding
o More information overload
o More flaming
o Slower feedback
o Fewer information cues
o Incomplete use of information

GDSS over traditional group techniques limited or reduced the following process losses:
o Less attention blocking
o Less conformance pressure
o Less airtime fragmentation
o Less attenuation blocking
o Less socializing
o Less individual domination

Group decision support systems (GDSSs), a subclass of DSSs, are defined as information technology-based support systems that provide decision-making support to groups [24]. They refer to systems that provide computer-based aids and communication support for decision-making meetings in organizations. The group meeting is a joint activity in which a group of people is engaged with equal or near-equal status. The activity and its outputs are intellectual in nature. Essentially, the outputs of the meeting depend on the knowledge and judgment contributed by the participants. Differences in opinion may be settled by negotiation or arbitration.

Components of GDSS
The difference between GDSSs and DSSs is the focus on the group versus the individual decision-maker. The components of a GDSS are basically similar to those of a DSS, including hardware, software, and people; but in addition, within the collaborative environment, communication and networking technologies are added for group participation from different sites. Moreover, compared with DSSs, GDSS designers pay more attention to the user/system interface with multi-user access and to system reliability, because a system failure will affect a multi-user group rather than just an individual. There are three fundamental types of components that compose GDSSs:

1. Software. The software part may consist of the following components: databases and database management capabilities, a user/system interface with multi-user access, specific applications to facilitate group decision-makers' activities, and modeling capabilities.

2. Hardware. The hardware part may consist of the following components: I/O devices, PCs or workstations, individual monitors for each participant or a public screen for the group, and a network to link participants to each other.

3. People. The people may include decision-making participants and/or a facilitator. A facilitator is a person who directs the group through the planning process.

Benefits claimed for GDSS
There are three benefits claimed for GDSSs: increased efficiency, improved quality, and leverage that improves the way meetings run. Due to increasing computer data processing power and communication and network performance, the speed and quality of information processing and information transmission create the opportunity for higher efficiency. Efficiency achievement depends on the performance of hardware (e.g., PCs, LAN/WAN) and software. With regard to the software aspect of GDSSs, the software architecture, with database management and an interactive interface, affects system run-time efficiency and performance. Improved quality of the outcomes of a group meeting implies increased quality of the alternatives examined, greater participation and contribution from people who would otherwise be silent, or decision outcomes judged to be of higher quality. In a GDSS, the outcome of a meeting or decision-making process depends on communication facilities and decision support facilities. Those facilities can help decision-making participants avoid the constraints imposed by geography. They also make information sharable and reduce effort in the decision-making process. Therefore, those facilities contribute to meeting quality improvement. Leverage implies that the system does not merely speed up the process (efficiency), but changes it fundamentally. In other words, leverage can be achieved through providing better ways of meeting, such as providing the ability to execute multiple tasks at the same time.

Factors that affect GDSS
There are usually several factors affecting GDSSs:
o Anonymity
o Facility design
o Multiple public screens
o Knowledge bases and databases
o Communication network speed
o Fixed versus customized methodology
o Software design
o Group size and composition
o Satisfaction

Information needs of groups
It is fundamental and important to clearly understand what groups do and which of their activities and procedures can be and should be supported by GDSSs. Also, it is necessary to know the information needs of groups and to examine how best to support these information uses with GDSSs. The information needs of groups cover a broad spectrum.

Database access
Databases are one of the basic components of GDSSs. GDSSs offer groups the advantage of accessing databases or some on-line service for the latest information. The databases can be internal or external databases.

This is a key element in information retrieval and sharing in a group meeting. The requirements on the presentation and functions of the obtained information can be summarized as follows:
o Information should be presented in clear and familiar ways.
o Information presentation and all other associated management control aspects should assist the decision-maker in guiding the process of judgment and choice.
o With an explanation facility, information containing advice or a decision suggestion enables users to know how and why results and advice are obtained.
o Information should be helpful in improving the precision of task situation understanding.
Moreover, information needs are based on the identification of the information requirements for the particular situation.

Information creation
In addition to a decision, the output of the meeting is new information. In a GDSS, all input into the computer is usually captured. In some cases, the actions of individual members of the group are stored in a database, file or some other storage format. Making a decision is not a point event. The decision is produced based on valuable knowledge. It is worthwhile to save this valuable information in efficient ways which make it convenient for further use.

Dissemination of information, decisions, and responsibilities
An often-cited advantage of GDSSs is that the participants are allowed to know what new information was created, what decision was reached, and who is responsible for follow-up or for implementing decisions.

On-line modeling
On-line modeling is the next step beyond sharing existing data. For example, the participants can perform on-line analysis and send out their results or ideas to a public board.

Visual decision-making
Some decisions involve visuals rather than words or numbers. Intuitively, graphics with shape, size, and color might make it easier and faster for users to have an overall view of the information.

Multimedia information presentation
The combination of visible and audible information presentation formats impacts the traditional information presentation format. The benefits of multimedia presentation include better interaction, more straightforward and effective communication in the group, and decreased learning time.

Idea generation
A variety of idea generation packages or methods exist for GDSS use.

Voting
This implies the ability to vote on, rank, or rate alternatives.

GDSSs have an impact on the work of individuals, groups, and organizations. In general, the performance improvement and satisfaction of individuals will lead to the improvement of the group. Both hardware and software will influence GDSSs. For example, the performance of a network will directly affect data transmission. If the network slows down, it will constrain the GDSS's capability for on-line data processing. Video and audio devices are adopted to make it more straightforward for users to recognize multimedia information, which results in improvements in efficiency and effectiveness, as well as in the quality of meeting outcomes. Hardware development and innovation are significant for GDSS performance. Software is another factor that has an impact on GDSS performance. Software and hardware interact and, to a certain extent, trade off performance. Because of either software or hardware, performance can be enhanced or inhibited depending on the target environment.
