Business Intelligence – Microsoft Press: “Business Intelligence”
Bought the paper book used through Amazon and purchased the PDF on the O’Reilly website.
Update 2011-08-08: the dead-tree book arrived at my favourite DHL Packstation 122 nearby.
Glossary
80/20 rule A theory invented by Vilfredo Pareto in the late 1800s, also
known as the Pareto principle, that describes the percentage imbalance
between input and output. The Pareto principle is not a law of science;
rather, it is a rule of thumb that can apply to many aspects of life. One of the
most common business examples is when 80 percent of a company’s revenue
comes from 20 percent of its customers.
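To make the revenue example concrete, here is a minimal Python sketch that checks how much of total revenue comes from the top 20 percent of customers. The function name `top_share` and the revenue figures are invented for illustration and do not come from the book.

```python
def top_share(revenues, top_fraction=0.2):
    """Return the share of total revenue contributed by the top customers."""
    if not revenues:
        raise ValueError("revenues must not be empty")
    ordered = sorted(revenues, reverse=True)
    # Size of the top group; always at least one customer.
    top_n = max(1, round(len(ordered) * top_fraction))
    return sum(ordered[:top_n]) / sum(ordered)


# Hypothetical revenue per customer, invented for illustration.
revenues = [500, 300, 40, 35, 30, 25, 25, 20, 15, 10]
share = top_share(revenues)
print(f"Top 20% of customers bring in {share:.0%} of revenue")  # prints 80%
```

Real data will rarely split exactly 80/20; the point of the rule of thumb is the imbalance, not the precise figures.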
actionability A criterion used to grade the importance of a BI opportunity
area based on its prospects of empowering people to take action in an organization. Actionability ratings are high, medium, and low.
ad hoc analysis The impromptu and flexible examination of data without
predefined or fixed formats. Ad hoc analysis gives users the ability to ask and
get answers to an infinite variety of questions quickly.
affinity grouping A descriptive data mining task that describes which items
go together based on a set of characteristics.
alternate hierarchy A different grouping of levels in a dimension. A dimension can have several alternate hierarchies to meet various analysis needs.
analysis gap A gap between the information that decision makers require
and the mountains of data that businesses collect every day.
ancestor Any member of a dimension at any higher level in relation to
another member of the same dimension.
base measure A measure that is captured at the transaction level in an
operational system.
benchmark A measure used for making comparisons, for example, industry-specific ratios such as a price/earnings ratio.
BI See business intelligence.
BI cycle A performance management framework; an ongoing cycle by
which companies set their goals,
analyze their progress, gain insight, take action, measure their success,
and start all over again.
BI solution A mechanism that brings together people, technology, and data
to deliver valuable information to business users.
blueprint A table that documents the measures and dimensions for answering business questions and reflects the most fundamental requirements for
building BI solutions.
business intelligence (BI) An approach to management that allows an
organization to define what information is useful and relevant to its corporate decision making. Business intelligence is a multifaceted concept that
empowers organizations to make better decisions faster, convert data into
information, and use a rational approach to management.
business reporting and analysis process A subset of processes responsible
for taking data from a BI system, such as a data warehouse, assembling it
into a business-friendly format, and delivering data to business users.
business-to-business (B2B) The exchange of products, services, or information between businesses.
business-to-consumer (B2C) The exchange of products, services, or information between businesses and consumers.
business unit An organizational structure in which a coherent set of functional activities rolls up into one line of business.
calculated measure A measure that is calculated or derived from a combination of base measures.
child A member that is directly subordinate to another member in a
hierarchy.
classification A predictive data mining task that assigns records to specific
categories according to the rules of a data mining model.
click-stream analysis The analysis of a user’s interaction with a Web site by
investigating the data that is generated with each user’s click in a Web
browser. The goal of click-stream analysis is to understand the behavior of
Web site visitors, identify their likes and dislikes, and use this information to
improve the quality of the Web site.
closed-loop analysis A process that allows end users to act on the outcomes of their analyses to automatically drive business processes.
clustering A descriptive data mining task that divides data into small
groups based on similarity without predefinition of the data groups.
cube A multidimensional data structure that represents the intersections of
each unique combination of dimensions. At each intersection there is a cell
that contains a data value.
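One way to picture a cube is as a mapping from each combination of dimension members to a cell value. The sketch below assumes three hypothetical dimensions (product, region, quarter) and invented sales figures; real OLAP servers use far more elaborate storage structures.

```python
from collections import defaultdict

# Hypothetical fact rows: (product, region, quarter, sales).
facts = [
    ("Bikes", "East", "Q1", 100),
    ("Bikes", "West", "Q1", 150),
    ("Helmets", "East", "Q1", 20),
    ("Bikes", "East", "Q2", 120),
    ("Bikes", "East", "Q1", 30),
]

# Each (product, region, quarter) intersection holds exactly one cell value.
cube = defaultdict(int)
for product, region, quarter, sales in facts:
    cube[(product, region, quarter)] += sales

print(cube[("Bikes", "East", "Q1")])  # prints 130
```

The two "Bikes, East, Q1" rows land in the same cell, which is why the cell holds their sum.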
custom aggregation A method of summarizing data from its lowest level of
detail to its highest level of detail in which measures are aggregated differently across different levels of a dimension.
database A collection of related data that is organized in a useful manner
for easy retrieval. There are different applications of databases depending on
the type of data to be stored and how the data is to be used.
data modelers Specialists who work with businesspeople and the technical
experts during the implementation of a BI solution. Data modelers are
responsible for gathering business requirements and translating these
requirements into a realistic design of dimensions and measures.
data mining An automated process that uses a variety of analysis tools and
statistical techniques to reveal actionable patterns and relationships in large,
complex data sets.
data mart A collection of data that is structured in a way to facilitate analysis. Data marts support the study of a single subject area, with all relevant
data from all operational applications brought together into that data mart.
Data marts may be of the relational (RDBMS) variety or the OLAP variety
depending on the type of analysis to be performed.
data warehouse A repository for data. Many experts define the data warehouse as a centralized data store that feeds data into a series of subject-specific data stores—called data marts. Others accept a broader definition of
the data warehouse as a collection of integrated data marts.
descendant Any member at any lower level in relation to another specific
member.
decision tree A model for breaking data into groups. A decision tree uses a
statistical algorithm to split the set of data being mined into branches of a tree.
descriptive data mining A form of data mining that produces a model to
describe patterns in historical data and requires human interaction to determine the significance and meaning of these patterns.
desktop online analytical processing (DOLAP) An OLAP storage mode
that keeps data on a client’s machine and provides local multidimensional
analysis.
dimension A categorically consistent view of data. All members of a dimension belong together as a group.
dirty data Data that is uncleansed or invalid because it is missing, incorrect, or duplicated.
DOLAP See desktop online analytical processing (DOLAP).
EDI See electronic data interchange (EDI).
electronic data interchange (EDI) A standard for the electronic exchange
of business data.
enterprise resource planning (ERP) system A business management system that integrates all facets of the business, including planning, manufacturing, sales, and marketing. ERP systems are most often implemented using
packaged software applications that support each facet of the business.
ERP system See enterprise resource planning (ERP) system.
estimation A predictive data mining task used to assign a new record with a predicted value according to the rules of a data mining model.
ETL See extract, transform, and load (ETL) processes.
Extensible Markup Language for Analysis (XML/A) A standard protocol
that OLAP clients can use to talk to OLAP servers. XML/A is based on the
widely adopted XML (Extensible Markup Language) standard and uses the
query language Multidimensional Expressions (MDX).
extract, transform, and load (ETL) processes Processes that are responsible for transporting and integrating data from one or more source systems
into one or more destination systems.
front-end tool A category of software that harvests the data stored in a data
warehouse and presents the data to users in the form of reports and interactive reviews.
functional area A department of a business unit that is focused on a specific function.
hierarchy The organization of levels within a dimension that (1) reflects
how data is aggregated from detailed levels to summarized levels and
(2) serves as the drill-down path for top-down business analysis.
HOLAP See hybrid online analytical processing (HOLAP).
hybrid online analytical processing (HOLAP) An OLAP tool that can store
data in both multidimensional databases and relational databases.
information consumer A community of business users that requires the
ability to dynamically query the database via a “guided” user experience that
allows drill down and pivoting when desired, while eliminating options that
may create undesirable results.
information user A community of business users that generally requires
standard reports without needing to analyze the data on an ad hoc basis.
interoperability A product’s ability to work together and interact with other
products.
key performance indicator (KPI) A measure that ranks as one of the most
important metrics in an organization. KPIs guide businesses in making decisions that affect particular business units as well as the company at large.
Key performance indicator is used interchangeably with metric.
KPI See key performance indicator (KPI).
leaf member A bottom-level member in a dimension.
materiality A criterion used to grade the importance of a BI opportunity
area based on how financially significant the opportunity is to the organization. Materiality ratings are high, medium, and low.
measure A numeric value that is of interest to business analysis.
member An item in a dimension that represents one or more occurrences of data.
mental model A collection of everything that we think we know about how
something works (in this case our business). This labeling of our understanding applies to not only people but also organizations. Some people refer
to the company’s mental model as “tribal wisdom.”
metadata Information about the properties of data, such as business logic
that describes the structure and content of dimensions and measures.
metric A measure that guides businesses in making decisions that affect
particular business units as well as the company at large. Metric is used
interchangeably with key performance indicator.
MOLAP See multidimensional online analytical processing (MOLAP).
multidimensional analysis A way of analyzing data in a top-down
fashion by examining measures simultaneously broken out by multiple
dimensions.
multidimensional online analytical processing (MOLAP) An OLAP storage
mode in which data is placed into special structures that are stored on a central server(s).
OLAP See online analytical processing (OLAP).
OLE DB An application programming interface (API) for accessing data.
OLE DB supports accessing data stored in any format (databases, spreadsheets, text files, and so on) for which an OLE DB provider is available.
OLE DB for OLAP Formerly the separate specification that addressed
OLAP extensions to OLE DB. Beginning with OLE DB 2.0, OLAP extensions
are incorporated into the OLE DB specification.
OLTP See online transaction processing (OLTP).
online analytical processing (OLAP) Multidimensional analysis that is supported by interface tools and database structures that allow instantaneous
access and easy user manipulation. Online analytical processing got its name
because this name contrasts well with OLTP, a term that was already in widespread use when the term OLAP was created. There are fundamental differences between transaction processing and analytical processing. OLAP systems
support multidimensional analysis at the speed of thought. OLAP typically
follows the client/server paradigm, where an OLAP database server is accessed
by many users who use multidimensional client tools to analyze data.
online transaction processing (OLTP) A data processing system designed
to record all the business transactions of an organization as they occur.
OLTP systems are structured for the purposes of running the day-to-day raw
data of business, which requires efficiency and minute processing of transactions at the lowest level of detail. An OLTP system processes a transaction,
performs all the elements of the transaction in real time, and processes
many transactions on a continuous basis. OLTP systems usually offer little or
no analytical capabilities.
open database connectivity (ODBC) A data access application programming interface (API) that supports access to any data source for
which an ODBC driver is available. ODBC is aligned with the American
National Standards Institute (ANSI) and International Organization for
Standardization (ISO) standards for a database call level interface (CLI).
operational database A database that supports the day-to-day operations of
an organization. Operational databases host the systems that organizations
use to run their business day to day. Most operational databases are OLTP
systems and store the data in a relational database management system.
opportunity area In BI technical terms, the logical grouping of measure
requirements, where data can be obtained consistently across all the dimensions at the same lowest level of detail. In business terms, similar to a project where a consistent set of requirements for a group of users can be
accommodated more or less from the same end-to-end system structures or
solution.
parent A member that is directly above another member in a hierarchy.
pilot project A short-term BI project that tests the feasibility of pursuing a specific opportunity area.
pivot and nest Point-and-click manipulations that facilitate multidimensional analysis. Pivoting means rotating rows to columns, and columns to
rows, in a cross-tabular data browser. Nesting is layering multiple dimensions on the rows or columns of a browser.
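Pivoting can be sketched in a few lines of Python. The example assumes a cross-tab stored as a dictionary of rows, each holding a dictionary of columns; the `pivot` helper and the figures are hypothetical.

```python
def pivot(table):
    """Swap rows and columns of a cross-tab stored as {row: {column: value}}."""
    pivoted = {}
    for row, cells in table.items():
        for column, value in cells.items():
            pivoted.setdefault(column, {})[row] = value
    return pivoted


# Hypothetical cross-tab: regions on the rows, quarters on the columns.
crosstab = {
    "East": {"Q1": 130, "Q2": 120},
    "West": {"Q1": 150, "Q2": 90},
}

pivoted = pivot(crosstab)
print(pivoted["Q1"])  # quarters are now the rows
```

Pivoting twice returns the original layout. Nesting could be modelled in the same structure by using tuples such as `("East", "Bikes")` as row keys.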
power analyst A community of business users that requires the full analytical power of the data mart. These users are willing to learn the details of
database design and the query tool in order to obtain the necessary results.
predictive data mining Data mining that produces a model for use with
new data to forecast a value or predict a probable outcome based on patterns
discovered in historical data.
proof-of-concept project A BI project that evaluates and selects technologies that can be used to host a data mart.
ragged hierarchy A hierarchy that has an inconsistent number of drill-down
levels.
ratio A measure where the result is calculated specifically from dividing
one measure by another.
RDBMS See relational database management system (RDBMS).
refresh rate The frequency with which data is updated. Typically the refresh
rate corresponds to the lowest level of detail of a time dimension required
for a group of measures.
relational database management system (RDBMS) A set of programs that
allows users to create, update, and administer data that is stored in a database of related tables.
relational online analytical processing (ROLAP) An OLAP storage mode
where data is stored in relational databases.
ROLAP See relational online analytical processing (ROLAP).
roll-up The hierarchical aggregations of data typical in multidimensional structures.
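As a rough illustration of a roll-up, the sketch below aggregates day-level values to months and then to years. It assumes a time hierarchy encoded as (year, month, day) tuples; the `roll_up` helper and the data are made up for this example.

```python
from collections import defaultdict

# Hypothetical leaf-level sales keyed by (year, month, day).
daily_sales = {
    (2011, 7, 30): 10,
    (2011, 7, 31): 15,
    (2011, 8, 1): 20,
    (2010, 12, 31): 5,
}


def roll_up(cells, depth):
    """Sum cells up to the first `depth` levels of their hierarchy key."""
    totals = defaultdict(int)
    for key, value in cells.items():
        totals[key[:depth]] += value
    return dict(totals)


monthly = roll_up(daily_sales, 2)  # keys are (year, month)
yearly = roll_up(monthly, 1)       # keys are (year,)
print(monthly[(2011, 7)], yearly[(2011,)])  # prints 25 45
```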
segmentation A data mining technique that analyzes data to discover
mutually exclusive collections of records that share similar attribute sets. A
segmentation algorithm can use unsupervised learning techniques such as
clustering or supervised learning for a specific prediction field.
semiadditive aggregation A method of summarizing data from its lowest
level of detail to its highest level of detail in which measures are not aggregated uniformly across all dimensions.
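A common textbook case of semiadditive aggregation is a stock level: it can be summed across stores but not across time, where the closing balance is typically used instead. The sketch below assumes that convention; the store names, months, and figures are invented, and the glossary itself does not prescribe this particular rule.

```python
# Hypothetical month-end stock levels: {month: {store: units}}.
stock = {
    "2011-06": {"Berlin": 40, "Munich": 25},
    "2011-07": {"Berlin": 35, "Munich": 30},
    "2011-08": {"Berlin": 50, "Munich": 20},
}


def stock_for_month(month):
    """Across the store dimension, stock levels simply add up."""
    return sum(stock[month].values())


def stock_for_period(months):
    """Across the time dimension, use the closing balance, not a sum."""
    # ISO-style "YYYY-MM" strings sort chronologically, so max() is the last month.
    return stock_for_month(max(months))


print(stock_for_month("2011-08"))      # prints 70
print(stock_for_period(stock.keys()))  # prints 70, not 65 + 65 + 70
```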
sibling A member that is at the same level as one or more other members
sharing the same parent.
slice and dice Two complementary methods for interacting with data.
Slicing means isolating a specific member of a dimension for analysis. Dicing
means breaking a data set into smaller pieces by examining how measures
intersect with multiple dimensions.
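The two operations can be sketched against a small set of cells keyed by dimension members. The dimension names, the helpers `slice_cells` and `dice`, and the numbers are assumptions made for this illustration only.

```python
from collections import defaultdict

# Hypothetical cells keyed by (product, region, quarter).
DIMENSIONS = ("product", "region", "quarter")
cells = {
    ("Bikes", "East", "Q1"): 130,
    ("Bikes", "West", "Q1"): 150,
    ("Helmets", "East", "Q1"): 20,
    ("Bikes", "East", "Q2"): 120,
}


def slice_cells(cells, dimension, member):
    """Slicing: keep only the cells for one member of one dimension."""
    axis = DIMENSIONS.index(dimension)
    return {key: value for key, value in cells.items() if key[axis] == member}


def dice(cells, *dimensions):
    """Dicing: break the measure out by the chosen dimensions, summing the rest."""
    axes = [DIMENSIONS.index(name) for name in dimensions]
    totals = defaultdict(int)
    for key, value in cells.items():
        totals[tuple(key[axis] for axis in axes)] += value
    return dict(totals)


east = slice_cells(cells, "region", "East")
by_product = dice(east, "product")
print(by_product)  # Bikes: 250, Helmets: 20 within the East slice
```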
slowly changing dimension A term that describes how dimensions reflect
data changes over time.
SQL See structured query language (SQL).
structured query language (SQL; pronounced sequel) An industry
standard language for accessing data (also called querying) in a relational
database management system (RDBMS).
uniform aggregation A method of summarizing data from its lowest level
of detail to its highest level of detail, where data can be aggregated the same
way across all dimensions.
visualization A graphical representation of data that sometimes reveals
patterns that are more apparent to the human eye.
write back The ability for users to update data in an underlying data mart.
zero client footprint A tool that does not require software to be installed
on a business user’s desktop, thus making the client application easier to
deploy to more users.