Posts

Showing posts from September, 2011

Data Warehousing Tools

The key general categories of data warehousing tools are:

• Spreadsheets
• Reporting and querying software: tools that extract, sort, summarize, and present selected data
• OLAP: online analytical processing
• Digital dashboards
• Data mining
• Decision engineering
• Process mining
• Business performance management
• Local information systems

Except for spreadsheets, these tools are sold as standalone tools, suites of tools, components of ERP systems, or components of software targeted to a specific industry. The tools are sometimes packaged into data warehouse appliances.

Open source free products
• Eclipse BIRT Project
• JasperSoft
• Pentaho Community Edition
• RapidMiner
• SpagoBI
• R

Open source commercial products
• Palo (OLAP database): OLAP Server, Worksheet Server and ETL Server
• Pentaho: reporting, analysis, dashboard, data mining and workflow capabilities

Proprietary free products
• InetSoft
• MicroStrategy: MicroStrategy Reporting Suite

Proprietary products
• ActiveR

The Data Warehousing Process

Stage 1: Determine Informational Requirements
• Identify and analyze existing informational capabilities.
• Identify from key users the significant business questions and key metrics that the target user group regards as its most important requirements for information.
• Decompose these metrics into their component parts with specific definitions.
• Map the component parts to the informational model and systems of record.

Stage 2: Evolutionary and Iterative Development Process
When you begin to develop your first data warehouse increment, the architecture is new and fresh. With the second and subsequent increments, the following is true:
• Start with one subject area (or subset or superset) and one target user group.
• Continue and add subject areas, user groups and informational capabilities to the architecture based on the organization’s requirements for information, not technology.
• Improvements are made from what was learned from previous increments.
• Impro

HOW IS THE WAREHOUSE DIFFERENT?

The data warehouse is distinctly different from the operational data used and maintained by day-to-day operational systems. Data warehousing is not simply an “access wrapper” for operational data, where data is simply “dumped” into tables for direct access. Among the differences:

Comparison of operational systems and data warehousing systems

Operational systems: Operational systems are generally designed to support high-volume transaction processing with minimal back-end reporting.
Data warehousing systems: Data warehousing systems are generally designed to support high-volume analytical processing (i.e. OLAP) and subsequent, often elaborate report generation.

Operational systems: Operational systems are generally process-oriented or process-driven, meaning that they are focused on specific business processes or tasks. Example tasks include billing, registration, etc.
Data warehousing systems: Data warehousing systems are generally subject-oriented, organized around business areas that
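The contrast between transaction processing and analytical processing can be illustrated with a small sketch. It uses Python's built-in sqlite3 module with an in-memory database; the `billing` table, its columns, and the helper names `record_payment` and `revenue_by_term` are hypothetical, chosen only to echo the billing example in the excerpt. A real warehouse would keep analytical data in a separate, subject-oriented store, so this is only a minimal illustration of the two workload shapes.

```python
import sqlite3

# Hypothetical schema: a single billing table in an in-memory database.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE billing ("
    "id INTEGER PRIMARY KEY, student_id INTEGER, term TEXT, amount REAL)"
)


def record_payment(conn, student_id, term, amount):
    """Operational style: one small write per business event."""
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT INTO billing (student_id, term, amount) VALUES (?, ?, ?)",
            (student_id, term, amount),
        )


def revenue_by_term(conn):
    """Analytical style: scan many rows and summarize them by subject."""
    cur = conn.execute(
        "SELECT term, SUM(amount) FROM billing GROUP BY term ORDER BY term"
    )
    return cur.fetchall()


# Made-up sample data for illustration only.
for row in [(1, "2011-fall", 500.0), (2, "2011-fall", 750.0), (1, "2012-spring", 500.0)]:
    record_payment(conn, *row)

summary = revenue_by_term(conn)
print(summary)
```

The write path touches one row at a time, while the summary query reads every row to answer a question about the business area as a whole.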

Data Warehouse-Concepts

A fundamental concept of a data warehouse is the distinction between data and information. Data is composed of observable and recordable facts that are often found in operational or transactional systems. At Rutgers, these systems include the registrar’s data on students (widely known as the SRDB), human resource and payroll databases, course scheduling data, and data on financial aid. In a data warehouse environment, data only comes to have value to end-users when it is organized and presented as information. Information is an integrated collection of facts and is used as the basis for decision-making. For example, an academic unit needs diachronic information (that is, information tracked over time) about the instructional output of its different faculty members to gauge whether it is becoming more or less reliant on part-time faculty. The data warehouse is that portion of an overall Architected Data Environment that serves as the single integrated source of data for processing information. The data w
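The step from data to information in the part-time faculty example can be sketched in a few lines of Python. The record layout (academic year, instructor status, credit hours), the sample values, and the function name `part_time_share_by_year` are all assumptions made for illustration; they do not come from any actual Rutgers system.

```python
from collections import defaultdict

# Hypothetical raw data: (academic_year, instructor_status, credit_hours).
# These values are invented for the example.
RECORDS = [
    (2009, "full-time", 120),
    (2009, "part-time", 40),
    (2010, "full-time", 110),
    (2010, "part-time", 55),
    (2011, "full-time", 100),
    (2011, "part-time", 100),
]


def part_time_share_by_year(records):
    """Turn raw teaching records into a per-year share taught by part-time faculty."""
    totals = defaultdict(float)
    part_time = defaultdict(float)
    for year, status, hours in records:
        totals[year] += hours
        if status == "part-time":
            part_time[year] += hours
    # Skip years with no recorded hours to avoid dividing by zero.
    return {
        year: part_time[year] / totals[year]
        for year in sorted(totals)
        if totals[year] > 0
    }


shares = part_time_share_by_year(RECORDS)
for year, share in shares.items():
    print(year, f"{share:.0%}")
```

Each raw record is a single fact; only the integrated, year-by-year share lets a unit judge whether its reliance on part-time faculty is rising or falling.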