Saturday, August 11, 2012

How to deal with Big Data!? MIT Article...

What are social media's larger problems?  One of them may be how to monetize networks approaching 1 billion members through strategic advertising, by properly harnessing the "insights" from Big Data.  There is no clear road map for doing this, and many people are starting to chart potential routes for various industries.
Below is a great article that proposes ways for many IT-oriented companies to properly harness big data:

MIT Article

Monday, July 23, 2012

Thursday, June 14, 2012

Big Data Startups Making for an Easier Commute

Many emerging Big Data startups are smaller B2B solution providers that are not in the headlines, and they may never become mainstream names like Splunk.  The recent Wall Street Journal article "Tapping 'Big Data' to Fill Potholes" mentions several of these smaller startups, with a common theme of helping drivers avoid traffic problems.  Inrix Inc. has turned its data analysis into a viable commercial business by generating revenue from the state of New Jersey, and there are plenty more highways in the world it can potentially expand to.  According to the article, “The New Jersey center offers a glimpse at the power of "big data," a term for techniques to gather reams of computerized information points, analyze them and spit out patterns, often in easy-to-understand visuals like maps or charts.”

In addition to giving traffic authorities better information for dealing with traffic concerns, Google Maps and navigation systems tell consumers more every day about how to travel conveniently.  Both mobile phone applications and in-car services such as OnStar make this possible.  These companies use both live updates and historical traffic pattern data to predict congestion and travel time.
Inrix Inc. is not only helping states improve their traffic situation; it has also recently been selected by BMW to support navigation and fuel economy efforts. This is a great opportunity for the company, and we will keep you posted on the progress of the partnership.


In addition to Inrix, both RAC & Waze have interesting related stories:


RAC - Over in the UK, the RAC uses vehicle data to identify congestion.  This insurance-based firm has a business model designed to use navigation and vehicle data to add value to its Breakdown Coverage services.

WAZE - Another startup, Waze Inc., concentrates on mobile applications for navigation and traffic patterns.  In fact, they tell you the optimal times to travel on holiday weekends!  Check it out for your next vacation!


These are just a couple of the business models that aim to build commercial businesses around traffic and navigation.   If you are interested in other startups leveraging Big Data, another great site called Beautiful Data recently published a list of the top 10 hot Big Data startups that is worth a look!  Let us know if you know of any other interesting Big Data efforts we should keep an eye on!

Friday, June 8, 2012

Big Data Analytics - Techniques and Trends - continued..


Welcome back! We continue with some more techniques and trends for analyzing Big Data. Our aim is not to make you an expert in all of these, but to plant a seed of inquisitiveness in your mind while touching upon the most prevalent concepts.

Here are a couple more widely used techniques that try to tap the potential of Big Data:

Sentiment Analysis:  A technique to identify and extract subjective information from source text material. Key aspects of these analyses include identifying the feature, aspect, or product about which a sentiment is being expressed, and determining the type, “polarity” (i.e., positive, negative, or neutral) and the degree and strength of the sentiment. Examples of applications include companies applying sentiment analysis to analyze social media (e.g., blogs, microblogs, and social networks) to determine how different customer segments and stakeholders are reacting to their products and actions.
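
To make the idea of "polarity" concrete, here is a minimal sketch in Python. It assumes a tiny hand-made word list and a simple count-based score, both of which are our own illustration; real sentiment tools use far richer lexicons or trained models.

```python
# Tiny hand-made lexicon (an assumption for illustration, not a real resource).
POSITIVE = {"love", "great", "fast", "good"}
NEGATIVE = {"hate", "slow", "bad", "broken"}


def polarity(text):
    """Return (label, score), where score = positive hits minus negative hits."""
    words = [w.strip(".,!?").lower() for w in text.split()]
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    if score > 0:
        label = "positive"
    elif score < 0:
        label = "negative"
    else:
        label = "neutral"
    return label, score


# Hypothetical social media posts about a product.
posts = [
    "I love the new phone, great camera!",
    "Battery is bad and the screen is broken.",
    "It arrived on Tuesday.",
]
results = [polarity(p) for p in posts]
for post, (label, score) in zip(posts, results):
    print(f"{label:>8} ({score:+d}): {post}")
```

The sign of the score gives the polarity and its size gives a crude measure of strength. Working out which product feature a sentiment refers to is a harder problem that this sketch does not attempt.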

Predictive Analysis: A set of techniques in which a mathematical model is created or chosen to best predict the probability of an outcome. It deals with extracting information from data and using it to predict future trends and behavior patterns. The core of predictive analytics relies on capturing relationships between explanatory variables and the predicted variables from past occurrences, and exploiting them to predict future outcomes. An example of an application in customer relationship management is the use of predictive models to estimate the likelihood that a customer will “churn” (i.e., change providers) or the likelihood that a customer can be cross-sold another product. This is used in conjunction with some of the data analysis techniques described earlier, such as data mining. The following video is a short and sweet illustration by a predictive analytics company: http://goo.gl/9k0sP
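
As a rough sketch of the churn example, the Python snippet below fits a one-variable logistic regression by plain gradient descent. The choice of model, the explanatory variable (number of complaints) and the data are all our own assumptions for illustration.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(xs, ys, lr=0.1, epochs=2000):
    """Fit p(churn) = sigmoid(w * x + b) by gradient descent on the log loss."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        errors = [sigmoid(w * x + b) - y for x, y in zip(xs, ys)]
        w -= lr * sum(e * x for e, x in zip(errors, xs)) / n
        b -= lr * sum(errors) / n
    return w, b


# Hypothetical past occurrences: complaints filed, and whether the customer left (1) or stayed (0).
complaints = [0, 1, 1, 2, 3, 4, 5, 6]
churned = [0, 0, 0, 0, 1, 0, 1, 1]

w, b = fit_logistic(complaints, churned)


def churn_probability(n_complaints):
    """Estimated likelihood that a customer with this many complaints churns."""
    return sigmoid(w * n_complaints + b)


for n in (0, 3, 6):
    print(f"{n} complaints -> churn probability {churn_probability(n):.2f}")
```

The fitted weight captures the relationship between the explanatory variable and the outcome seen in the past data, and the model then turns that relationship into a probability for a new customer.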


Now, as promised, we look at some buzzwords around Big Data analytics. A growing number of technologies are used to aggregate, manipulate, manage, and analyze Big Data, and most of them are based on distributed computing platforms. Distributed computing means:

 - Massively parallel computing, where a problem is divided into multiple tasks, each of which is solved by one or more computers working in parallel.

Here are some trendy technologies:

MapReduce: A software framework introduced by Google for processing huge data sets on certain kinds of problems on a distributed system. Check out this nice online presentation for a simple understanding http://goo.gl/Qz5PP

Mashup: An application that uses and combines data presentation or functionality from two or more sources to create new services. These applications are often made available on the Web, and frequently use data accessed through open application programming interfaces or from open data sources.

Hadoop: An open source (free) software framework for processing huge data sets on certain kinds of problems on a distributed system. Its development was inspired by Google’s MapReduce and Google File System. It was originally developed at Yahoo! and is now managed as a project of the Apache Software Foundation.
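
To give a feel for the MapReduce model mentioned above, here is a minimal word-count sketch in Python. It runs in a single process, and the function names are our own. A real framework such as Hadoop would spread the map and reduce calls across many machines and handle the grouping step itself.

```python
from collections import defaultdict


def map_phase(document):
    """Map: emit a (word, 1) pair for every word in one document."""
    return [(word.lower(), 1) for word in document.split()]


def shuffle(pairs):
    """Shuffle: group all emitted values by key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce: combine the values for one key into a single result."""
    return key, sum(values)


documents = ["big data is big", "data is everywhere"]

# Each document could be mapped on a different machine; here we simply loop.
mapped = [pair for doc in documents for pair in map_phase(doc)]
counts = dict(reduce_phase(k, v) for k, v in shuffle(mapped).items())
print(counts)
```

Because each map call looks at only one document and each reduce call at only one key, the work can be divided into independent tasks, which is what makes the model a good fit for distributed systems.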

Although this genre of technologies is too vast to cover fully in one post, we have tried to familiarize you with the basic concepts. Do let us know your views. See you soon...

References:
McKinsey report: http://goo.gl/ycvef
TDWI library reports: www.Tdwi.org
Wikipedia


Friday, June 1, 2012

Big Data Analytics - Techniques and Trends …


   Making sense out of BIG DATA
Alright! Now we have tonnes of information about Big Data. The question is, how do enterprises make sense of it? So let us explore the various data analysis techniques that are either 1) most commonly used by companies across various industries or 2) relatively new but show strong growth potential in the near future. Through a series of posts, we will try to touch upon these techniques. The idea is to get familiar with the buzzwords around Big Data.


Although there is a buzz around “Advanced Analytics” for Big Data analysis these days, researchers claim that it is mostly built upon the fundamentals of “Business Intelligence” or “BI” techniques. So, setting aside all the tweaks, customizations and modifications for the moment, let us grasp the basics first.

BI encompasses a set of computer-based methodologies that help analyze and report on or present large amounts of ‘structured’ or ‘unstructured’ data. Is this something new? Apparently not: businesses have long used it to support activities such as decision making, prediction and number crunching. Check out this marketing video by a company called Avitas, which gives an idea of BI and its prospects: http://goo.gl/blKTe

However, the context in which these techniques are being used is changing: they are now applied to Big Data, which is just data after all!

Here are some known techniques under BI:

1. OLAP – Online Analytical Processing:
A data retrieval process used for structured databases, more commonly known as data warehouses. The major focus of this technique is to query or retrieve and effectively combine data from multiple sources or dimensions aggregated in a relational structure. Commonly used are OLAP cubes, which combine, analyze and present data along several dimensions (three in the classic picture of a cube). A typical data extraction would read like: sales of a company’s product x in region y for period z, extracted from data sets organized by product, region and period.
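
The "product x in region y for period z" query can be sketched in a few lines of Python. The sales figures and the helper names `aggregate` and `roll_up` are hypothetical; a real OLAP engine precomputes and indexes these aggregates rather than scanning a list.

```python
from collections import defaultdict

# Hypothetical sales facts: (product, region, period, units sold).
facts = [
    ("widget", "north", "2012-Q1", 120),
    ("widget", "south", "2012-Q1", 80),
    ("widget", "north", "2012-Q2", 150),
    ("gadget", "north", "2012-Q1", 60),
    ("gadget", "south", "2012-Q2", 90),
]

DIMENSIONS = ("product", "region", "period")


def aggregate(facts, **fixed):
    """Sum units over all facts that match the fixed dimension values."""
    total = 0
    for *coords, units in facts:
        cell = dict(zip(DIMENSIONS, coords))
        if all(cell[dim] == value for dim, value in fixed.items()):
            total += units
    return total


def roll_up(facts, dimension):
    """Total units per value of one dimension, summing over the other two."""
    index = DIMENSIONS.index(dimension)
    totals = defaultdict(int)
    for fact in facts:
        totals[fact[index]] += fact[3]
    return dict(totals)


# One cell of the cube: product x, region y, period z.
single_cell = aggregate(facts, product="widget", region="north", period="2012-Q1")
# A slice: the same product and region across all periods.
widget_north = aggregate(facts, product="widget", region="north")
by_region = roll_up(facts, "region")
print(single_cell, widget_north, by_region)
```

Fixing all three dimensions picks out a single cell, while leaving one or more dimensions open sums across them, which is the basic move behind slicing and rolling up a cube.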

2. Data Mining:
A methodology used to extract patterns from large datasets by combining methods from statistics and machine learning with database management. Examples of usage might include mining customer data to determine segments most likely to respond to an offer, mining human resources data to identify characteristics of most successful employees, or market basket analysis to model the purchase behavior of customers.

Drilling further into this category, the following methods are used independently or in conjunction with one another to analyze data and, by extension, ‘Big Data’:

- Association rule learning
A technique for discovering interesting relationships, i.e., “association rules,” among variables in large databases based upon a set of algorithms. One application is market basket analysis, in which a retailer can determine which products are frequently bought together and use this information for marketing. A commonly cited example is the discovery that many supermarket shoppers who buy diapers also tend to buy beer. You can refer to the Forbes article about the IBM computing that brought about that discovery here: http://goo.gl/UNIFS
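
Two standard measures used to judge an association rule are support (how often the items appear together) and confidence (how often the rule holds when its left-hand side appears). Here is a small Python sketch for the diapers and beer rule, using made-up baskets:

```python
# Hypothetical shopping baskets (made-up data for illustration).
baskets = [
    {"diapers", "beer", "chips"},
    {"diapers", "beer"},
    {"diapers", "milk"},
    {"beer", "chips"},
    {"milk", "bread"},
]


def support(itemset, baskets):
    """Fraction of baskets that contain every item in the itemset."""
    return sum(itemset <= basket for basket in baskets) / len(baskets)


def confidence(antecedent, consequent, baskets):
    """Of the baskets containing the antecedent, the fraction that also contain the consequent."""
    return support(antecedent | consequent, baskets) / support(antecedent, baskets)


rule_support = support({"diapers", "beer"}, baskets)
rule_confidence = confidence({"diapers"}, {"beer"}, baskets)
print(f"diapers -> beer: support {rule_support:.2f}, confidence {rule_confidence:.2f}")
```

Algorithms for association rule learning search large numbers of candidate itemsets efficiently; this sketch only scores one rule that we picked by hand.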

- Cluster analysis
A method for splitting a diverse group of objects into smaller groups of ‘seemingly’ similar objects, whose characteristics of similarity are not known in advance. An example of cluster analysis is segmenting consumers into self-similar groups for targeted marketing, for instance recommending to a customer a movie that was bought or liked by another customer in the same group. It stands almost in contrast to simple ‘classification’, up next!  
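
One well-known clustering algorithm is k-means, sketched below in Python. The choice of k-means, the customer data (visits per month, average spend) and the starting centroids are all our own assumptions. Note that no group labels are given in advance.

```python
def kmeans(points, centroids, iterations=10):
    """Plain k-means: assign each point to its nearest centroid, then move the centroids."""
    clusters = []
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for p in points:
            distances = [sum((a - b) ** 2 for a, b in zip(p, c)) for c in centroids]
            clusters[distances.index(min(distances))].append(p)
        # New centroid = mean of the cluster; keep the old one if the cluster is empty.
        centroids = [
            tuple(sum(dim) / len(cluster) for dim in zip(*cluster)) if cluster else c
            for cluster, c in zip(clusters, centroids)
        ]
    return centroids, clusters


# Hypothetical customers: (visits per month, average spend).
customers = [(1, 10), (2, 12), (1, 14), (9, 80), (10, 85), (8, 90)]

# Start from two of the customers as initial guesses for the group centres.
centroids, clusters = kmeans(customers, [customers[0], customers[3]])
for centre, members in zip(centroids, clusters):
    print(centre, members)
```

The algorithm discovers the two segments purely from how close the customers are to each other, which is what "not known in advance" means in practice.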

- Classification
This method identifies categories in which new data points belong, based on a training set containing data points that have already been categorized based on similar traits. One application is the prediction of segment-specific customer buying behavior where there is a clear hypothesis or objective outcome.
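
A simple way to see classification at work is the k-nearest-neighbours rule, sketched below in Python. The technique choice, the labels and the training data are our own illustration. The key point is that the training set already carries a category for each data point.

```python
from collections import Counter

# Hypothetical training set: (visits per month, average spend) with a known segment label.
training = [
    ((1, 10), "occasional"),
    ((2, 12), "occasional"),
    ((1, 14), "occasional"),
    ((9, 80), "frequent"),
    ((10, 85), "frequent"),
    ((8, 90), "frequent"),
]


def classify(point, training, k=3):
    """Label a new point by majority vote among its k nearest training points."""
    by_distance = sorted(
        training,
        key=lambda item: sum((a - b) ** 2 for a, b in zip(point, item[0])),
    )
    votes = Counter(label for _, label in by_distance[:k])
    return votes.most_common(1)[0][0]


new_customers = [(2, 15), (9, 70)]
labels = [classify(c, training) for c in new_customers]
print(list(zip(new_customers, labels)))
```

Each new customer is assigned to the category of the most similar customers we have already labelled, so the quality of the prediction depends on how well the training set represents each segment.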

Dear avid readers! Considering the heaviness of the data dose in this post, we have decided to use a common technique for delivering the most sought-after information effectively (no, it’s not related to Big Data!): it’s simply called a 'sequel'. So keep visiting for the next one, where we will talk a bit more about some other basic techniques and introduce the latest trends in managing BIG DATA, such as Hadoop, Mashup and MapReduce.

Sources and references for detailed report and materials:
McKinsey report: http://goo.gl/ycvef
TDWI library reports on BigData: www.Tdwi.org



Friday, May 25, 2012

The Bigness of Big Data.




If everybody is talking about Big Data, it must be something very cool, don’t you think? Every day the term is mentioned in newspapers, websites, schools, business meetings, conferences… Currently, Big Data is all over, and that is exactly why we are writing about it.
EMC^2’s video, which we featured last week, helped us a lot in understanding what Big Data is all about. Now let us present our interpretation of this in-vogue tech concept.
Big Data (BD) is massive amounts of data that cannot be handled with conventional tools. Imagine analysing all the tweets posted in one country in a day using conventional database tools. Tricky, right? There is so much information available in our world that using it is becoming very problematic. Big Data applications allow people and companies to solve problems by converting unprocessed data into useful information.
Two of the big players in this market, IBM and EMC^2, identify three main dimensions of Big Data:
·      Size: Big data is certainly big. Data is available in enormous quantities.
·      Speed: Data is generated extremely fast. To be competitive, users need to process and to analyse the data very fast.
·      Variety: Data come in many forms and from many different sources. (Dates, Names, Bank Accounts, Bar Codes, Videos, emails, Tweets, Web Sites, etc.)
Big Data is useful in a wide range of contexts. Some of our favourite applications: electronic payment for private or public companies; agile analytics in the stock market; business intelligence for new ventures; security (predicting or detecting fraud); and data warehousing in social networks.
Big Data is a tool that creates competitive advantages in business. Getting up-to-date information from many more sources and processing data faster will enable companies to understand their markets better, to anticipate crises, and to make intelligent decisions. Those extracting value from the existing and growing data will be ahead of the competition. That, for us, is the “bigness” of Big Data.

As always, more on this topic in the coming days…


The image used in this post is a piece named Electress by Nick Gentry. He is a British artist who recycles tech products like floppy disks to create his paintings. http://www.nickgentry.com/index.html