Was your Math Education a Waste of Time?

Reflections on the book A Mathematician’s Lament by Paul Lockhart

Because I am an IT professional, many people assume I must love math.  While I did OK at math in school, I never really enjoyed it.  A Mathematician’s Lament goes a long way to explaining why.

In contrast, I remember vividly my first contact with computers.  I was in grade 5.  I had Mr. B. (one of those cool male teachers who went by his initial only).  The computer was an Apple II E.  The code he showed us was simple:

10 PRINT "HELLO"
20 GOTO 10

Of course, this meant the computer would print HELLO forever, or at least until you hit the ESC key.  I was immediately struck by the infinite potential of a machine that did exactly whatever you said.  I would go on to spend years tinkering with code, writing home-made computer games and otherwise experimenting with these great machines (one silly experiment involved seeing how much abuse a 5 1/4″ disk could take before becoming completely unreadable – surprisingly, it could take a lot!).  I went on further to win multiple awards as a top computer student and yet never really pictured myself working with computers, doing a business degree instead.  Computers were for fun!

Math, on the other hand, was pure drudgery.  The timed multiplication table quizzes in elementary school were the worst – to this day I freeze up on any short timed event (I am unable to play Scattergories with the timer).  Throughout my mathematical education, I often thought “if only I could get a computer to do all this!”  Mathematical proofs in high school were no better – I took them as exercises in irrelevance that simply had to be memorized.  My success in high school math came only from monotonous repetition of exercises.  I took out my frustrations in math with a doodled character I named “Math Man” who had gone insane because of math, loosely based on Bloom County’s Bill the Cat.

Once in second-year university, an economics professor realized that most of us had no clue about calculus (even after finishing 2 courses on it) so he took the time to give us an economist’s eye view of calculus, and for the first time calculus actually made sense.  Sadly, moments of inspired understanding such as this were few and far between.

What Lockhart tells us is that the way math is taught in schools is wrong.  It kills creativity, innovation and true understanding.  I agree completely.  Lockhart argues that if music were taught in the same way, students would spend years transposing written music from one key to another but never hear a song.  He argues that high school math proofs are written in jargon and are incomprehensible, even to most mathematicians.  To be truly successful, a mathematical education needs to be experimental and even fun.  Math education should involve play and discovery, puzzles and games, and moments of eureka!

In a sense, my education in computers and mathematics gave me very opposite experiences.  One was fun, free-spirited adventure; the other boring, formalized and sterile.  Lockhart tells us it doesn’t have to be this way.  I highly recommend his book to anyone who felt that his or her mathematical education was a waste of time.


The Panic Factor

“Don’t Panic!” – The Hitchhiker’s Guide to the Galaxy

Panic
Creative Commons License photo credit: scott1723

I often find that when business users are faced with large, even colossal, discrepancies in data sets, their first response is sheer panic.  For me, the larger the discrepancy the calmer I am about it.  Why?  Because the larger a discrepancy is, the easier its source almost always is to find.  The truly bedevilling problems are the small, inconsistent, and seemingly random discrepancies.

What do I find the number one cause of numbers being 5, 10 or even hundreds of times greater than they should be?  A time series snapshot that is summing all time periods.  This is remarkably easy to fix, and can happen easily in a data cube without proper current time period settings.  It is a common mistake for junior report developers, especially when they are expecting to see current period only.
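To make the snapshot mistake concrete, here is a minimal sketch in Python.  The inventory figures, the period labels and the two function names are all made up for illustration; a real cube would do this through its current-period setting, not hand-written code.

```python
# Hypothetical month-end inventory snapshots: (period, product, units_on_hand).
# A snapshot is a level at a point in time, so it must not be added across periods.
snapshots = [
    ("2010-05", "widget", 100),
    ("2010-06", "widget", 110),
    ("2010-07", "widget", 105),
    ("2010-05", "gadget", 40),
    ("2010-06", "gadget", 45),
    ("2010-07", "gadget", 50),
]


def total_all_periods(rows):
    """The mistake: add up every snapshot, counting the stock once per period."""
    return sum(units for _period, _product, units in rows)


def total_current_period(rows):
    """The fix: keep only the most recent period before summing."""
    current = max(period for period, _product, _units in rows)
    return sum(units for period, _product, units in rows if period == current)


inflated = total_all_periods(snapshots)    # 450
correct = total_current_period(snapshots)  # 155
print(f"all periods: {inflated}, current period only: {correct}")
```

With only three periods the total is already almost three times too high; with years of monthly snapshots it becomes the "hundreds of times" kind of discrepancy.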

Other common and simple problems include rounding errors or inconsistencies, improper unit of measure calculations, unexpected null values nixing a summarization total, or cube update failures.  Always check for obvious and easy solutions first, and work your way down to more complicated resolutions as necessary.  Note when a problem started and attempt to isolate what has changed since then.  Be methodical and don’t panic!
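Two of these simple problems are easy to reproduce in a few lines of Python.  This is only a sketch with invented numbers: it shows a rounding inconsistency (rounding each line before summing versus rounding the sum) and a single missing value wiping out a total.  How a null behaves in practice depends on the database or cube engine; here a missing value is modelled as NaN, which is an assumption.

```python
import math
from decimal import Decimal, ROUND_HALF_UP

CENT = Decimal("0.01")


def round_cents(amount):
    """Round a Decimal amount to two places, halves rounding up."""
    return amount.quantize(CENT, rounding=ROUND_HALF_UP)


# Rounding inconsistency: two reports built from the same three line items.
line_items = [Decimal("0.335")] * 3
rounded_then_summed = sum(round_cents(x) for x in line_items)  # 0.34 * 3 = 1.02
summed_then_rounded = round_cents(sum(line_items))             # 1.005 -> 1.01

# Hypothetical daily sales; None marks a value that failed to load.
daily_sales = [120.0, 95.5, None, 130.25]


def naive_total(values):
    """Model a missing value as NaN; one NaN turns the whole total into NaN."""
    return sum(math.nan if v is None else v for v in values)


def checked_total(values):
    """Sum the known values and report how many were missing."""
    known = [v for v in values if v is not None]
    return sum(known), len(values) - len(known)


total, missing = checked_total(daily_sales)
print(rounded_then_summed, summed_then_rounded)
print(naive_total(daily_sales), total, missing)
```

Neither result is "wrong" on its own.  The discrepancy appears when two reports take different paths to the same number, which is why these are worth ruling out first.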


The Dangers of Self-Service BI

Self-Service BI can be a wonderful thing.  Ideally, business users will have a wealth of corporate information at their fingertips and be able to produce meaningful reports quickly and easily themselves, freeing up time for BI developers to work on other meaningful BI initiatives such as scorecarding, data warehousing, data mining – just to mention a few.

The real danger, however, is that Self-Service BI can be an IT driven initiative to reduce workload on itself, often resulting in a product built entirely from an IT perspective with limited input from business since they are often “too busy” to talk about BI.  The end result can at best be confusing and at worst useless to the business user community.

In my experience, the best results in BI are achieved when developers and business users work closely and collaboratively on business and technical issues.  When the developer or data modeler can understand the business being modeled and the business user can understand the basic technicalities of the design, a true win-win can be achieved.  This is not to say that a developer must fully understand the complete business end-to-end, or that the business user must understand each line of code.  But when each can see the solution from the other’s perspective, an optimal solution is close at hand.

Overall, Self-Service BI can be successful if:

  • The scope is well defined
  • The business is well-defined and understood by developers and modelers
  • Business users are active participants in planning, designing and testing
  • The business user community is well trained in the BI environment’s reporting tools

This last point is particularly essential if business users are expected to create their own reports.


An Introduction to the Cognos SDK, Part 4 – Cognos 8 Extended Applications and Conclusion

In this final entry in our series on the SDK we’ll touch on Cognos Extended Applications, a set of JSP tags that enable the development of custom “portlets” that can be hosted by a number of “portal” applications, including IBM’s WebSphere portal and SAP’s Enterprise Portal (as well as Cognos Connection, of course).

JSP technology includes the ability to create custom tags that contain back-end code. Just as a pair of tags like <a> </a> means something specific to the browser, and <% %> delimiters demarcate code that will be executed on the server side (but is contained within the JSP page), the user can create tags of their own that call code when the page is rendered, with that code living in a Java class the user has created.

The Cognos SDK includes several tag libraries that can be used to create JSP pages that can then be registered with a portal. Once again, these libraries enable the user to essentially perform any task that can be performed through the normal interface, but extended or combined as the user wants. Once registered, the JSP portlet will be available to users of the existing corporate portal. The details of registering a JSP page with the portal vary with the portal being used.

The Extended Applications approach is ideal for the development of content you want to present within an existing portal, but it is complex, and requires the use of the JSP portion of the J2EE stack.

The C8 SDK is a powerful tool for organizations that want to extend the power of Cognos 8 within the enterprise, and provides a number of tools to do so. The BI Bus API provides a set of classes that can be used to build applications that interact with C8 “from the ground up”. The URL API provides an easy, “lightweight” way of calling methods to interact with C8 via a correctly formatted URL, or from JavaScript within a client. Finally, Extended Applications provides a set of custom JSP tags within libraries that can be used to create JSP pages that can then be registered as portlets within an enterprise portal.


Security Studio by First Quarter

Lockdown
Creative Commons License photo credit: 2Tales

Cognos security in its simplest form is very straightforward:  You either can access the Cognos environment or you cannot.  Most Cognos clients I know shy away from highly complex hierarchical security implementations within their organizations, where managers, regions, departments and/or VPs all have specialized access rights to reports, filtered data or even column-level control.  Security is usually broken down into a very small number of groups (if even that much) and reports usually follow a “see-it-or-not” approach.

The reason that most Cognos security implementations are kept so simple, I believe, is that managing a highly complex Cognos security implementation is a real chore.  The more granular and complex the security, the more burdensome the task.  Furthermore, this is not a one-time setup task, but an administrative requirement to maintain for the life of the Cognos environment, as users move around the organization.

But there is a product that can help.

I recently attended a demo of Security Studio by First Quarter.  The product allows you to define security easily throughout your Cognos 7 and Cognos 8 environments.  You define your users, groups, roles and rights with a simple interface, and then Security Studio updates these security settings in the Cognos environments.  I could easily see how this would be a huge time-saver for a complex hierarchical security implementation.  It’s certainly worthy of consideration if you are maintaining or considering a complex Cognos security environment.



As Easy as 1-2-3

A Book Review of The Math Instinct by Keith Devlin

Abacus, Filofax, wrong result
Creative Commons License photo credit: matsuyuki

As part of my summer reading, I read the fascinating and informative book The Math Instinct.   This included an interesting journey through mathematics in the natural world, plus a review of how math works in the human mind.  It was this latter part that really intrigued me the most.

The book opens with recent studies that show newborn babies have the innate ability to count to 3, even at a few days old.  It turns out all humans, including the ancients, have the natural ability to count one, two and three.  In fact all numbering systems use a similar system to show these numbers as either a dot or line.  What about our representation of 1, 2, 3?  Looking back at the ancient Indian script these numbers are based upon, they match the pattern as well:

1, 2, 3 in Ancient Indian script

1, 2, 3 in modified Ancient Indian script (without lifting the pen)

What’s more, studies show that people are very good with arithmetic when it is used in a meaningful context.  Child shopkeepers in Brazil were shown to have good math skills at their market stalls, but failed abysmally at identical math questions in a formal classroom setting.  The same was found with adult shoppers and carpenters.  Why?  Because when math is reduced to symbols it quickly loses its meaning to most of us.

And what about those pesky times tables?  Our minds are made to recognize patterns, and it is this pattern recognition that messes us up when it comes to multiplication.  A typical six-year-old child has a vocabulary of between 13,000 and 15,000 words but will struggle to learn the 18 numbers in the single-digit times tables (removing repeats and simple ones like times 1, times 2, and times 5).  It is pattern interference that prevents us from learning this easily.

In short, we are all better at math than we give ourselves credit for.  We have no trouble determining the larger size of a product when shopping and we are very good at spotting a bargain.  This book will help you see math differently, and open up your mind to the possibility that we may all be math-smart after all.


Good King Censusless

An introduction for my many international followers:

King:  Canadian Prime Minister Stephen Harper
Page:  Canadian Industry Minister Tony Clement
Statistician:  Recently resigned head of Statistics Canada, Munir Sheikh

Special thanks to data quality expert Jim Harris whose Dr. Seuss-style data quality limericks and songs served as a partial inspiration to this piece.  His blog can be found on my Blogroll (Obsessive Compulsive Data Quality).

Enjoy!

Good King Censusless looked out
On the cottage season.
With the sunshine round about
Warm and crisp and even.
Everyone was drinking beer
Feeling great elation.
How could he disrupt the cheer
breaking cross the nation?

“Mr Clement, come by strife,
If you know so, say it.
How can I make foul the life
Of the summer respite?”
“Sire, a man I once knew long
Loathed the census taking
If you could remove this wrong
You’d be nation-making.”

“Make it so”, he said at once
With no consultation,
“Though I may be thought a dunce
Causing consternation.”
Statistician would not toast
His part in this madness.
He would rather quit his post
Causing him much sadness.

Harper bellowed “What a fool!
Get that man to focus!
He should know that math’s not cool,
Stats are hocus pocus.”
Statistician stood his ground
In the public’s favour.
He said he was honour-bound;
People saw him braver.

“Bring me hatchets, bring me fire,
We shall burn his cabin!
He’s earned my unholy ire!
He won’t know what happened!”
Page and Monarch, off they trode,
Off they trode together
Feeling stormy, yet instead
Of the sunny weather.

Statistician’s cabin burned
To the ground next morning.
Page and Monarch never learned,
Though this be a warning:
Cabin dwellers all be sure
Be you all accounted,
Those who cannot count the poor
Can’t themselves be counted.


Privacy in the Information Age

An IT developer recently compiled a list of 100 million Facebook users and posted it to BitTorrent for sharing. Much was made of this “security breach” (Facebook denied a “security breach” since all data compiled was publicly available anyway). What was it that was listed? Any profile information made public, such as your name, location, and your list of friends. This is public on my Facebook profile so I assume I am on the list.

Now virtually all my Facebook information is restricted to friends only, including pictures and posts. Only my friends list and a restricted personal profile are shown to non-friends. But if I remove this, I would be rendered virtually invisible on Facebook, and what would be the point of that? This is public information precisely so I can be discovered on Facebook by lost acquaintances. I have found numerous acquaintances that I had lost contact with for more than ten years in this way. The purpose of Facebook is to be discovered. But I restrict everything else to friends.

Conversely, virtually all of my LinkedIn information is totally public. Why the difference? One is to share privately with friends, the other is to build my career, which requires exposure to the greater world.

In short, the information I provide is given to better myself. One so I can be found in a world-wide directory, and the other so I can increase business contacts and share my professional experience.

The Canadian census can be seen in a similar light. Data given to Statistics Canada is used to improve our communities, our cities, and our country. It is used in urban planning, such as the location of new schools and hospitals. It is used when determining where to start a business, found a new place of worship, or launch a new job training program. To refuse to be counted is to literally not count in your community. It’s like not voting. You are only hurting yourself.

However, when a government argues that a census is a) unnecessarily intrusive, b) a tedious and academic exercise in futility, or c) a secret nefarious plot by shadowy yet unnamed enemies-of-freedom, it is both demonizing and belittling the data collection process itself. Such action will result not only in a drop in confidence in any attempted voluntary census but will also erode confidence in Canada’s public institutions. And that hurts all of us.

We live in the information age. It’s time the Government of Canada recognizes that.



Why Data Quality Matters

Why do we collect data?  What is it good for?  Do we even need it?  These are the questions that I see posed in the Canadian census debate.  As a data practitioner, I have seen my share of useless data, poor data, fudged data, and absolutely essential data.  Today’s corporations wade through masses of data to find nuggets of data gold.  Running a corporation today without data is like flying a modern aircraft without a functioning navigational system.  The same could be said of running a government.

Censuses have been conducted in all sophisticated societies in history, usually with the most up-to-date technology of the day.  The U.S. Census of 1890 employed the newly invented Hollerith tabulating machine.  Within decades tabulating machines were essential to major enterprises.  Following a merger in 1911, Hollerith’s company was renamed International Business Machines in 1924.

Census data collection has evolved since then, with some trail-blazing nations forgoing the census altogether.  But make no mistake: in place of mandatory long forms, there is a centralized registry of citizens complete with national ID numbers.  I think this is a good and efficient system but would libertarians ever agree to this?  Surely not if a 20% chance of filling out a form once every 5 years is too “invasive”.

Why doesn’t a voluntary form work?  Simply put, “responder bias”: your sample population is self-selecting or otherwise skewed.  In one of the most famous cases of responder bias in history, George Gallup correctly called the 1936 presidential re-election of Franklin D. Roosevelt when everyone else got it wrong.  Most other pollsters of the day sent mail-out ballots to potential voters based on phone numbers and car registries.  But in those days, millions of voters had neither telephones nor cars!  How did Gallup do it?  He sent pollsters to talk to people in person.  And hence the Gallup poll became a mainstay of politics.
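The effect of a self-selecting sample can be worked through with simple arithmetic.  The sketch below uses an invented electorate of 1,000 people split into two groups with different response rates; none of these figures come from the 1936 polls, and the group names and the function are purely illustrative.

```python
from fractions import Fraction

# Hypothetical electorate: (group, size, supporters of candidate A, response rate).
groups = [
    ("has phone or car", 400, 140, Fraction(1, 2)),
    ("has neither", 600, 420, Fraction(1, 10)),
]


def true_and_observed_support(rows):
    """Compare support in the whole population with support among responders."""
    population = sum(size for _name, size, _yes, _rate in rows)
    supporters = sum(yes for _name, _size, yes, _rate in rows)
    # Expected responders: each group answers at its own rate.
    responders = sum(size * rate for _name, size, _yes, rate in rows)
    responding_supporters = sum(yes * rate for _name, _size, yes, rate in rows)
    return Fraction(supporters, population), responding_supporters / responders


true_share, observed_share = true_and_observed_support(groups)
print(f"true support: {float(true_share):.0%}")          # 56%
print(f"voluntary responders: {float(observed_share):.0%}")  # 43%
```

In this made-up case a comfortable majority looks like a minority simply because one group answers five times as often as the other.  Collecting more voluntary responses does not help; the skew is in who responds, not in how many.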

198/365 - Quality
Creative Commons License photo credit: aithom2

Any census must deal with the question of data quality.  Much has been made of the “Jedi Knight” entries under “religion”.  How companies deal with data quality is by employing standards or business rules against a data set.  Certainly collecting data as close to its source as possible is a very good way to ensure quality data, as is automating data collection.  But what is proposed in Canada will weaken data quality, not strengthen it.  No superior alternative is being proposed.
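As a rough illustration of what "business rules against a data set" means, here is a small Python sketch.  The records, the field names and the three rules are all hypothetical; real rules would come from the organization's own data standards.

```python
# Hypothetical census-style records; the field names are illustrative only.
records = [
    {"id": 1, "age": 34, "religion": "Catholic"},
    {"id": 2, "age": -5, "religion": "None"},
    {"id": 3, "age": 27, "religion": "Jedi Knight"},
    {"id": 4, "age": 61, "religion": ""},
]

# Values to route to a reviewer (an assumption made for this sketch).
REVIEW_LIST = {"jedi", "jedi knight"}

# Each business rule is a description plus a check that returns True on a pass.
RULES = {
    "age must be between 0 and 120": lambda r: 0 <= r["age"] <= 120,
    "religion must not be blank": lambda r: r["religion"].strip() != "",
    "religion must not be on the review list": lambda r: r["religion"].lower() not in REVIEW_LIST,
}


def audit(rows, rules):
    """Return {record id: [descriptions of failed rules]} for failing rows."""
    failures = {}
    for row in rows:
        failed = [name for name, check in rules.items() if not check(row)]
        if failed:
            failures[row["id"]] = failed
    return failures


report = audit(records, RULES)
for record_id, failed in report.items():
    print(record_id, failed)
```

The point of rules like these is to flag records for follow-up, not to silently discard them.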

Is the long form census perfect?  Not at all.  Is it 100% correct?  No.  Is it labour-intensive and quickly outdated?  Yes.  Could we collect data in a better way?  Yes.  But it is better by far than a voluntary form because a voluntary form will degrade data quality.

And why does data quality matter at the end of the day?  Because bad management starts with bad data.  Sometimes bad data is systemic, such as that which led to the global financial crash of 2008.  Sometimes bad data is deliberate, such as that which led to the rise and eventual demise of Enron.  But hiding or fudging data is dangerous and damaging – it will be discovered eventually and your reputation will show it.  Whether you are trying to hide toxic assets, off-balance sheet debt, shoddy manufacturing, unsafe products, poor employee performance, or entire segments of your population, you will be found out by independent researchers, international governance organizations, concerned consumers, outraged citizens or inside whistleblowers.  And the day of telling will not be pretty.


How Denmark does it: The Danish Census Model

Throughout the ongoing controversy in Canada over the end of the mandatory long form census, many have argued that Denmark (among other Scandinavian countries) no longer conducts a census.  I asked fellow data professional and blogger Henrik Liliendahl Sørensen to explain how his country manages population data as a guest contributor to BIProfessional.com:

Census Options: The Scandinavian Model

Dannebrog
Creative Commons License photo credit: @boetter

The Scandinavian model exemplified through the Danish variant does not require citizens to periodically fill out a census form.  Census information is extracted automatically when needed from administrative registers.

When a new Danish citizen is born (typically at a hospital) the child is assigned a national identification number within minutes. The ID is linked to the mother’s ID and, if she is married, automatically to her husband’s ID as the father. Otherwise the father’s ID is obtained (if possible) within a short time. In case of immigration, procedures exist for assigning a national ID and collecting basic data. All information is kept in a centralized citizen registry.

The less romantic consequence of a marriage is that the two national IDs are linked in the citizen registry from that day forward. A divorce will result in a deactivation of the link.

All buildings and, for anything other than a single-family house, all the apartments within them are recorded in a centralized registry. When a new house or apartment is established a lot of data is captured, and if the residence is changed the data is updated.

Your place of living is a relation between your national ID and the unique ID of the residence. The day you moved in is the valid-from date, and the day you move on is registered as the valid-to date.
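The residence relation could be modelled roughly as follows.  This is a sketch only: the class, the field names and the ID formats are invented for illustration and do not reflect the actual Danish registry design.  It also assumes that the move-out day counts towards the new residence.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class Residency:
    """One row linking a person to a residence for a period of time."""
    national_id: str
    residence_id: str
    valid_from: date
    valid_to: Optional[date] = None  # None means the person still lives there

    def covers(self, day: date) -> bool:
        # Assumption: the valid-to day itself belongs to the next residence.
        return self.valid_from <= day and (self.valid_to is None or day < self.valid_to)


def residence_on(rows, national_id, day):
    """Return the residence ID for a person on a given day, or None if unknown."""
    for row in rows:
        if row.national_id == national_id and row.covers(day):
            return row.residence_id
    return None


# Hypothetical history: one move on 15 July 2008.
history = [
    Residency("ID-0001", "RES-100", date(2001, 3, 1), date(2008, 7, 15)),
    Residency("ID-0001", "RES-200", date(2008, 7, 15)),
]

print(residence_on(history, "ID-0001", date(2005, 1, 1)))
print(residence_on(history, "ID-0001", date(2010, 8, 1)))
```

Because every row carries its own validity period, a census-style question such as "who lived where on a given date" becomes a lookup instead of a form to fill out.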

Practically all events in the life of a citizen involving a public sector body are logged with the national ID. This also includes healthcare and interaction with financial services and employer relations where mandatory reporting exists.

The technical opportunities for compiling census information based on these registrations are plenty. However every case must be approved by a body within the authorities and wherever possible data must be made anonymous in the actual processing.
