Thursday, May 31, 2012

Please do not take weights lightly in the Big Data world

Don't forget the weights if you want to play in the Big Data jungle. I have not seen this topic of "weights" given enough weight by many experts and practitioners, or by the Big Data companies. The topic seems to be either ignored or simply not mentioned as I comb through the literature and attend talks and panels on Big Data.

Weights could be used at every step: data collection and storage; structuring or semi-structuring; data migration, consolidation and synchronization; Map-Reduce type processing; the creation of various kinds of metadata; and, most importantly, data analysis using different data science and business intelligence techniques, and of course data visualization.

Weights are very important because not all data are born and processed equally. So far, the Big Data world has in effect followed a "data democracy" paradigm, in which every piece of data counts the same. I believe in democracy for people, even though not all parts of the world believe in or practice 'one person, one vote'. Big Data practitioners should wake up and start using "weights" ASAP.

Some of you may argue that the concept of weights is basically subjective. I agree with you. However, the practice of weights is important for the following reasons:


  • Just by assigning different weights to different sources and types of data, we recognize that not all data are born or processed equally.
  • Different data have different origins, contexts, utility and value in the value chain of aggregation or combination.
  • The moment we start thinking of the weights associated with the data, we consciously attempt to understand its weighted value in the analysis process.
  • The weights do not have to be fixed. They could be calculated based on a variety of contributing factors.
  • The weights, the weight calculation formulas, and the aggregation and combination formulas at different levels could be refined as our understanding of them improves.
  • We could utilize various iterative, recursive, correlation, confidence level and statistical distribution techniques. You may contact me for any detailed discussions.
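To make the idea of calculated weights concrete, here is a minimal sketch in Python. It is only an illustration, not a recommended formula: the `Reading` class, the `reliability` and `freshness` factors, and the choice to multiply them together are all assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass
class Reading:
    """One data point from one source (hypothetical structure)."""
    source: str
    value: float
    reliability: float  # assumed contributing factor in [0, 1]
    freshness: float    # assumed contributing factor in [0, 1]


def weight(reading: Reading) -> float:
    """Assumed weight formula: the product of the contributing factors."""
    return reading.reliability * reading.freshness


def weighted_mean(readings: list[Reading]) -> float:
    """Combine values so that higher-weight readings count for more."""
    total_weight = sum(weight(r) for r in readings)
    if total_weight == 0:
        raise ValueError("all weights are zero; nothing to aggregate")
    return sum(weight(r) * r.value for r in readings) / total_weight


readings = [
    Reading("survey", 10.0, reliability=1.0, freshness=0.5),
    Reading("social_media", 20.0, reliability=0.5, freshness=1.0),
    Reading("archive", 40.0, reliability=0.5, freshness=0.0),
]

# "Data democracy": every reading counts the same.
unweighted = sum(r.value for r in readings) / len(readings)

# Weighted view: the stale archive reading gets weight 0 and drops out.
weighted = weighted_mean(readings)

print(f"unweighted mean: {unweighted:.2f}")
print(f"weighted mean:   {weighted:.2f}")
```

Because the weight is computed by a function rather than stored as a constant, the formula can be refined later without touching the aggregation step.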
As soon as you get up in the morning, you should lift some weights to be a Big Data champion.

Monday, May 21, 2012

The fundamentals of: Cause(s) → Effect(s) → $ → Election Dynamics → Congratulations Mr. President

The fundamentals of:

Cause(s) → Effect(s) → $ → Election Dynamics → Congratulations Mr. President



Cause "Indie" Individual Personality Dimensions

1. Tradition, history, family, church (N)
2. Political and social agendas (M)
3. Track record (H)
4. Age demographics (H)
5. Profession / vocation (H)
6. Location (L)
7. Health (M)
8. Minority and immigration (H)
9. Labor statistics - unemployment (volunteers-time-$)
NOTES: The dimensions will have weights:
L (Low), M (Medium) and H (High)
Weights can be numbered, independent, dependent or time dependent

Cause E  External Influence

1. Terrorism, war, defense expenditure and policy (H)
2. Economic climate (jobs / unemployment) (H)
3. R&D investments (L/N)
4. Journalists / media (H+) - influence voters and donors
5. Social institutions (H) – churches, special interest groups
6. Export/Import -> Jobs (H)
7. Environmental policy, go-green initiatives (M/H)
8. Energy policy and security (M)
9. Natural disasters (M/L)
NOTES: The dimensions will have weights:
L (Low), M (Medium) and H (High)
Weights can be numbered, independent, dependent or time dependent


Cause ---> Effect Analysis
1. Weighted parameters in an equation f(WnPm), where Wn are the weights and Pm the parameters
2. Different parameters can eventually be correlated with each other - inter- and auto-correlation
3. Representation in the form of Venn diagrams
4. Venn diagrams - for example, to show the intersections between groups
5. Opinions / beliefs of individuals versus those of the candidate
6. The "aggregate" opinion could be used by the media, candidate, party etc.
7. Reciprocally, media, candidate and party messages and statements influence individuals and certain societal groups.
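As one possible reading of the weighted-parameter equation in item 1, the sketch below maps the L / M / H levels to numbers and computes a normalized weighted sum. The numeric values 1, 2 and 3, the parameter values between 0 and 1, and the names `LEVEL_WEIGHTS` and `weighted_score` are assumptions made for illustration; the outline itself does not fix any of them.

```python
# Assumed numeric values for the L / M / H levels used in the outline.
LEVEL_WEIGHTS = {"L": 1.0, "M": 2.0, "H": 3.0}


def weighted_score(params: dict[str, tuple[str, float]]) -> float:
    """Normalized weighted sum: sum(W_n * P_n) / sum(W_n).

    `params` maps a dimension name to (level, value), where the value is a
    hypothetical 0..1 measure of how well a candidate matches the voter.
    """
    total_weight = sum(LEVEL_WEIGHTS[level] for level, _ in params.values())
    weighted_sum = sum(
        LEVEL_WEIGHTS[level] * value for level, value in params.values()
    )
    return weighted_sum / total_weight


# Three of the individual dimensions, with made-up values.
voter = {
    "track_record": ("H", 0.8),
    "health": ("M", 0.5),
    "location": ("L", 0.2),
}

score = weighted_score(voter)
print(f"weighted score: {score:.2f}")  # approximately 0.60
```

Dividing by the total weight keeps the score in the same 0..1 range as the inputs, so scores stay comparable when dimensions are added or removed.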

Methodology
1. Fishbone Diagram – Cause(s) --> Effect(s)
2. Multidimensional Chart - Influences between personality dimensions and correlations / synergy with candidate
3. Venn Diagram for analysis and visualization
4. Scenario Analysis - What is the effect if one or more parameters change?
5. Fish Eye Visualization - User can zoom in to any one cause and see the different effects
6. Cluster Analysis and Visualization
7. Mind Map Analysis and Visualization
8. Heat Map Visualization
9. Zoom and Drill Down
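The scenario analysis step (item 4) can be sketched as follows: change one or more parameters and measure how much the weighted effect moves. This is a toy model under stated assumptions. The numeric weights, the parameter values and the function names `effect` and `scenario_delta` are all hypothetical.

```python
# Assumed numeric weights for three of the external influences.
WEIGHTS = {
    "economic_climate": 3.0,   # rated H in the outline
    "energy_policy": 2.0,      # rated M
    "natural_disasters": 1.0,  # rated M/L; taken as L here (assumption)
}


def effect(params: dict[str, float]) -> float:
    """Toy effect model: a plain weighted sum of the parameter values."""
    return sum(WEIGHTS[name] * value for name, value in params.items())


def scenario_delta(baseline: dict[str, float], changes: dict[str, float]) -> float:
    """Return how much the effect moves when `changes` override the baseline."""
    changed = {**baseline, **changes}
    return effect(changed) - effect(baseline)


baseline = {
    "economic_climate": 0.5,
    "energy_policy": 0.5,
    "natural_disasters": 0.5,
}

# What if the economic climate improves from 0.5 to 0.75?
delta = scenario_delta(baseline, {"economic_climate": 0.75})
print(f"change in effect: {delta:+.2f}")
```

With a linear model like this one, the change equals the weight times the size of the move, so high-weight causes dominate; a real model could swap in a different `effect` function without changing `scenario_delta`.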

Big Data Methodologies
  • Data sources: Social Media and Networks like Facebook and Twitter
  • Big Data Storage
  • Map and Reduce Methodologies
  • Platforms - Hadoop, MongoDB, Cassandra etc.
  • Business intelligence - MicroStrategy, Jaspersoft etc.
  • Visualization - MicroStrategy etc.
  • Data Science - R, SAS and other data science packages


Saturday, May 19, 2012

Donkey Versus Elephant



People in the USA choose an animal they like every four years. They do not get to choose from all the animals in the forest or the zoo. They choose between a donkey and an elephant.


I have been talking about Big Elephants in this blog routinely. Please do not confuse the BIG DATA elephant, the icon or mascot I use as a poetic analogy, with the Republican Elephant and the Democratic Donkey.


I am at the Datafest event on the Stanford University campus today, Saturday, May 19, and tomorrow, Sunday, May 20, 2012. I am currently listening to professors from the Journalism Department and an executive from a non-profit organization called the Sunlight Foundation. They are all talking about presidential elections and how campaign funding varies and is influenced.


My interest is more fundamental. Voting, funding and elections at the local, state and national levels of government are all influenced by public opinion. Public opinion forms, varies and is mutually influenced, and it can be studied in terms of events, actions and influences: media and communication in all their variety, organizations, family and society events, the economy, international politics and, most importantly, the post-2000 phenomena of Social Media and Social Networking.


Let us see how the Datafest is going to unfold.