Thesis on "Data Mining"


Thesis 10 pages (3527 words) Sources: 8


Data Mining

Evaluating Data Mining as a Strategic Technology

Data mining techniques make it possible to gain insights quickly from a diverse and often incompatible set of databases and data sets. Data mining is the process by which very large data sets are analyzed for trends, patterns, insights and intelligence not discernible from a cursory manual analysis of the data sets themselves (Osei-Bryson, Rayward-Smith, 2009). It is also the study of how to glean insights and intelligence from data sets that are often not integrated with each other in a common database, which adds a level of abstraction to the analysis and makes its interpretation even more difficult (Buddhakulsomsiri, Zakarian, 2009). Evaluated as a strategic technology, data mining can deliver an exceptional level of insight. The use of data mining for auto warranties (Buddhakulsomsiri, Zakarian, 2009), where a massive amount of data must be interpreted to complete government reporting requirements, is a case in point. The intent of this analysis is to evaluate data mining as a strategic technology.

Evaluating Data Mining as a Strategic Technology

The continual refinement of data mining from a technology into a platform on which solutions for analyzing, monitoring and defining are built continues at an accelerating pace (Osei-Bryson, Rayward-Smith, 2009). Economic uncertainty and the need companies have to compete using intelligence are among the primary factors driving its adoption and growth (Li, Wu, 2010). Global economic recessions tend to be the catalysts of information technologies that have the potential to deliver inordinately large increases in insight and in competitive and market intelligence. The use of data mining is accelerating as companies across all industries seek a competitive advantage through analysis of their channels, customers, suppliers and internal processes.

Examples of data mining abound in industries that have collected an exceptionally large amount of information from customers. These include, but are not limited to, aerospace and defense (Cressionnie, 2008), auto manufacturing, including aftermarket auto warranty analysis and lifetime product quality of automobiles (Buddhakulsomsiri, Zakarian, 2009), customer relationship management (Sun, 2006), education (Velasquez, Gonzalez, 2010), healthcare (Li, Wu, 2010) and many others. Despite their diversity, these industries share a common need for greater insight into the interrelationships hidden in the structured and unstructured content of their organizations. They also share the need to use their own data to understand how strategies in place today will yield results in the future (Kuhn, Ducasse, Girba, 2007). Data mining also requires intensive data integration across databases and legacy, often standalone, systems, in addition to a redefinition of the most critical processes used for accumulating information in the first place (da Cunha, Agard, Kusiak, 2010). The intensive nature of data, system and process integration can, however, yield significant insights and intelligence that could not be captured before.

The intent of this analysis is to evaluate the essentials of data mining, including its definitions; assess data mining as a technology trend; analyze how data mining and its many associated technologies are managed and used at Google; and assess the future direction of data mining. Data mining is also leading to the development of text mining applications that take in massive amounts of unstructured text and create linguistic models from the data so new insights can be found, including in the emerging field of customer sentiment analysis (Li, Wu, 2010). CRM-based implementations of data mining often include sentiment analysis, which provides insights into branding and perceptions of companies obtained through social networks (Sun, 2006). The future of data mining is going to include sentiment analysis and the ability to ascertain attitudinal data from the massive amounts of data being generated by social networks (Lai, Liu, 2009).

Defining Data Mining

Definitions of data mining vary significantly in scope and in their inclusion or exclusion of key concepts. The most common definition covers four types of relationships: classes, clusters, associations and sequential patterns (Han, Kamber, 2000). Definitions also vary in their reliance on the level of insight and intelligence that these processes deliver, with the most recent concentrating on linguistic modeling that can determine sentiment and attitudinal scaling from the unstructured content of social networks (Li, Wu, 2010). The more mainstream definition of data mining, however, concentrates on the integration of disparate, often non-integrated systems so that a single system of record can be produced upon which analysis, queries and advanced extraction can be performed (Berry, 2004). Extract, Transform and Load (ETL) technologies and Online Analytical Processing (OLAP) are often used to create the reporting and analytical frameworks that organizations rely on to streamline the analysis, reporting and continual updating of databases in a data warehouse, which is in turn used for completing data mining tasks (Rutledge, 2009).
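The ETL and OLAP steps described above can be sketched in a few lines. This is a minimal illustration only, not the approach of any cited author: the two CSV extracts, the table and column names, and the helper functions (extract, transform, load, sales_by_region) are all hypothetical, and an in-memory SQLite database stands in for a data warehouse.

```python
import csv
import io
import sqlite3

# Hypothetical exports from two non-integrated systems (all values are invented).
CRM_CSV = """customer_id,region
1,East
2,West
3,East
"""

ORDERS_CSV = """order_id,cust,amount_usd
100,1,25.00
101,2,40.50
102,1,10.00
103,3,5.25
"""


def extract(text):
    """Extract: read rows from a CSV export into dictionaries."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(customers, orders):
    """Transform: unify key names and types so both sources share one schema."""
    cust_rows = [(int(c["customer_id"]), c["region"]) for c in customers]
    order_rows = [
        (int(o["order_id"]), int(o["cust"]), float(o["amount_usd"])) for o in orders
    ]
    return cust_rows, order_rows


def load(conn, cust_rows, order_rows):
    """Load: write the cleaned rows into warehouse tables."""
    conn.execute("CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, region TEXT)")
    conn.execute(
        "CREATE TABLE orders (order_id INTEGER PRIMARY KEY, customer_id INTEGER, amount REAL)"
    )
    conn.executemany("INSERT INTO customers VALUES (?, ?)", cust_rows)
    conn.executemany("INSERT INTO orders VALUES (?, ?, ?)", order_rows)
    conn.commit()


def sales_by_region(conn):
    """OLAP-style roll-up: total order amount per region."""
    query = """
        SELECT c.region, SUM(o.amount)
        FROM orders AS o
        JOIN customers AS c ON c.customer_id = o.customer_id
        GROUP BY c.region
        ORDER BY c.region
    """
    return conn.execute(query).fetchall()


conn = sqlite3.connect(":memory:")  # stand-in for a data warehouse
load(conn, *transform(extract(CRM_CSV), extract(ORDERS_CSV)))
totals = sales_by_region(conn)
```

Once both sources sit in one schema, the roll-up is a single query; in a production warehouse the same pattern would typically run on a schedule against far larger tables.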

While there are major differences in these definitions of data mining, they all share the common mission of unifying the analytical, transaction and customer-based databases that are prevalent throughout organizations. Data mining applications are used for determining patterns, relationships and the relative strength or weakness of causality in data sets, often with the aim of bringing greater intelligence to transaction-based records and databases (Maggioni, 2009). In many data mining systems the overarching objective is to gain greater insight into transactions so that more effective selling and CRM-based strategies (Sun, 2006) can be developed. Definitions of data mining also vary in their reliance on the underlying technologies for finding relationships in the data. Traditionally, statistically based analytics applications were used to examine causality and the strength or weakness of interrelationships in the data (Cressionnie, 2008). There are also data mining applications that seek to create neural networks (Han, Kamber, 2000) that can interpolate the relationships between data elements and create causal models over time. Google uses data mining not only to determine how users access its search engine, but also to define personalization (Stamou, Ntoulas, 2009) and to develop linguistic models through latent semantic indexing (Kuhn, Ducasse, Girba, 2007), which gives the search engine provider a better understanding of how to index the Internet.

Classes, clusters, associations and sequential patterns are the four types of relationships that data mining applications seek to discover and add insight to (Stamou, Ntoulas, 2009). Classes, as the name suggests, are stored data that provide segmentation-based insights, including the purchasing behavior of customers and their demographic characteristics. Classes are often used as segmentation criteria across all industries that rely on data mining. Clusters are the second type of relationship that data mining applications look for in analyzing data sets and systems of record (Stamou, Ntoulas, 2009). Clusters are data items that are grouped through previously defined customer relationships and preferences, and as a result they are also used in the development of market segments. Clustering has also been used in linguistic modeling to determine customer audiences within segments, including the definition of consumer affinities for given channels of communication and methods of learning about new products (Sun, 2006). Data modeling in this regard has been instrumental in the development of entirely new approaches to managing communications and to integrating social networking applications into the multichannel messaging strategies of companies. The third type of relationship that data mining applications look to capture, validate and report on is associations. The classic connection of husbands and young fathers who purchase beer and diapers in the same grocery store run is an example of this type of relationship (Li, Wu, 2010). The last type of relationship that data mining applications seek to find is sequential patterns, which are used for predicting the future behavior of a specific audience or customer segment, including the development of mass customization selections for build-to-order products and services (da Cunha, Agard, Kusiak, 2010).
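An association such as the beer-and-diapers case above is commonly quantified with support, confidence and lift. The sketch below computes those three measures over a handful of invented baskets; the data and the function names are assumptions made for illustration and do not come from any of the cited studies.

```python
# Hypothetical market baskets; the items and counts are invented for illustration.
baskets = [
    {"beer", "diapers", "chips"},
    {"beer", "diapers"},
    {"diapers", "milk"},
    {"beer", "chips"},
    {"milk", "bread"},
]


def support(itemset, baskets):
    """Fraction of baskets that contain every item in `itemset`."""
    itemset = set(itemset)
    return sum(itemset <= basket for basket in baskets) / len(baskets)


def confidence(antecedent, consequent, baskets):
    """Estimated P(consequent | antecedent) from basket counts."""
    both = set(antecedent) | set(consequent)
    return support(both, baskets) / support(antecedent, baskets)


def lift(antecedent, consequent, baskets):
    """Confidence relative to the consequent's baseline frequency.

    A value above 1 suggests the items co-occur more often than chance.
    """
    return confidence(antecedent, consequent, baskets) / support(consequent, baskets)


# Rule under test: customers who buy diapers also buy beer.
rule_support = support({"beer", "diapers"}, baskets)
rule_confidence = confidence({"diapers"}, {"beer"}, baskets)
rule_lift = lift({"diapers"}, {"beer"}, baskets)
```

With these invented baskets the rule holds in two of five baskets, and two of the three diaper purchases also include beer. Real association mining would apply minimum support and confidence thresholds across many candidate item sets rather than test one rule by hand.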
The use of sequential patterns to develop cross-sell and up-sell selections in e-commerce systems is becoming more prevalent as this type of data mining gains adoption and integration into e-commerce platforms. The development of mass customization product strategies is also highly dependent on this ability to determine sequential associations between products. The use of linguistic modeling and latent semantic indexing within Google is another example of this approach to discovering and analyzing sequential patterns over time (Stamou, Ntoulas, 2009). The use of these linguistic models to determine specific personalization requirements for each search on Google is an example of data mining taken to a highly personalized level (Stamou, Ntoulas, 2009).
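As a rough sketch of how sequential patterns might feed a cross-sell suggestion, the example below counts how many customers bought one item at some point before another. The purchase histories, the function names and the min_customers threshold are all hypothetical; production sequential-pattern mining uses more elaborate algorithms than this pairwise count.

```python
from collections import Counter

# Hypothetical purchase histories, each ordered oldest to newest (invented data).
histories = {
    "cust_a": ["laptop", "mouse", "dock"],
    "cust_b": ["laptop", "dock"],
    "cust_c": ["mouse", "laptop", "dock"],
    "cust_d": ["laptop", "mouse"],
}


def ordered_pair_counts(histories):
    """Count customers who bought `first` at some point before `later`."""
    counts = Counter()
    for items in histories.values():
        seen_pairs = set()
        for i, first in enumerate(items):
            for later in items[i + 1:]:
                if first != later:
                    seen_pairs.add((first, later))
        counts.update(seen_pairs)  # each customer counts once per ordered pair
    return counts


def next_item_suggestions(item, histories, min_customers=2):
    """Items bought after `item` by at least `min_customers` customers, most frequent first."""
    counts = ordered_pair_counts(histories)
    followers = [
        (later, n)
        for (first, later), n in counts.items()
        if first == item and n >= min_customers
    ]
    return sorted(followers, key=lambda pair: (-pair[1], pair[0]))


suggestions = next_item_suggestions("laptop", histories)
```

Because order matters, a customer who bought a mouse before a laptop does not count toward the laptop-then-mouse pattern, which is what separates a sequential pattern from a plain association.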

The foundation of all data mining definitions also includes five major elements that illustrate the major process steps required for data mining applications to be successful (Li, Wu, 2010). The first stage is the extract, transform and load (ETL) of data into the data warehouse systems (Stamou, Ntoulas, 2009) so the data can be quickly queried and used to create models for continual analysis of data sets. The second…


Quoted Instructions for "Data Mining" Assignment:

“

* This class is called Information of Management Systems and my topic is about Data Mining.

The research paper should address the following:

1.) Provide a definition of the topic (in some cases there may be multiple definitions, if so distinguish them)

2.) Discuss why it is an important technology trend

3.) Provide at least one example of a vendor who markets the hardware or software that applies to your topic

4.) Provide at least one example of a company that implemented your topic; discuss their results

5.) What do you see (and why) as the future trend of your topic

6.) Use topic headings within your paper to organize its content

* There are many current trends in information technology. Some provide lasting value to organizations and some can be considered fads. The ones that provide lasting value are important to an organization either because it supports operations that result in increased sales, service or efficiency or because they aid in organizational decision making.

* The research paper should be ten pages, double spaced, and use MLA format. Please do not use Wikipedia as a source; it is not reviewed and contains some erroneous information. The research paper must use at least eight different references, of which at least three cannot be older than one year (Why? Because technology changes rapidly). You may use more references if you wish.

Paper Format:

1.) Include a title page with your name and title of the report (doesn't count as part of your page count)

2.) Use Times New Roman 12-point font

3.) Report should be 10 pages (double space)

4.) Include page numbers

5.) Include a reference page that identifies all references used (also doesn't count as part of the page count)

6.) References should be in the following format:

Boudreau, M., Gefen, D., & Straub, D. W. (2001) Validation in Information Systems Research: A State-of-the-Art Assessment, MIS Quarterly, 25 (1), 1-16

Sources:

Some possible sources of IT information:

1.) MIS Quarterly

2.) Journal of Computer Information Systems

3.) Journal of Information Systems

4.) Harvard Business Review

5.) Journal of Management Information Systems

6.) Sloan Management Review

7.) Journal of Data Warehousing

8.) Journal of Database Management Systems

9.) Journal of Strategic Information Systems

10.) CIO magazine

11.) Information Week

12.) Books

13.) Internet

* Please email me if you have any questions regarding this research paper. Please let me know if you understand this agenda. Thank You!

How to Reference "Data Mining" Thesis in a Bibliography

“Data Mining.” A1-TermPaper.com, 2010, https://www.a1-termpaper.com/topics/essay/data-mining-evaluating/533946. Accessed 5 Oct 2024.
