Saturday, October 31, 2015

Plagiarism Case Study, or, Holy Teachable Moment!


Brian Wise, a music writer for WQXR who contributed stories to NPR, was fired this past week (10/28/2015) for plagiarism. In an amazing display of transparency, NPR has posted all of the offending passages found, along with links to the source materials from which Wise stole.


What I find most interesting is that Wise often tweaked and adjusted the plagiarized passages so that they would fit within the rest of his writing, the vast majority of which appears to be original. In other words, Wise was doing what students are tempted to do: visiting websites and borrowing a small number of phrases and sentences without attribution. If Wise were a student facing our school's honor council, he might well escape with a letter of warning.

Wise’s apology sounds almost exactly like what students of mine have said when caught red-handed in plagiarism cases. Although Wise acknowledges responsibility, he minimizes his guilt by calling his lapses “unintentional” and saying, “NPR and WQXR have identified some sentences and phrases in my work that were similar to those used in other media outlets.” Similar, my eye--many of the stolen passages are word-for-word duplicates. However, if using Ctrl+C and Ctrl+V (copy and paste) is a habit, then I agree with him that at least some of his lapses were unintentional.

Idea: Require all initial drafts to be written in longhand or banged out on an old-fashioned typewriter, so that there is no temptation to use Ctrl+C and Ctrl+V. Such a policy would help professionals as well as students.

Wise surely knew that there is a correct way to use an unusual turn of phrase (e.g., “lumpy piety” taken from a 3/13/2000 article by Mark Swed and used in Wise's 4/19/2013 WQXR blog post), and there is a wrong way. The correct way is to give a tip of the hat to the author, saying something like, “The symphonies exhibit, in the words of Mark Swed of the Los Angeles Times, a certain lumpy piety.” The wrong way is to use Ctrl+C and Ctrl+V.

Friday, November 21, 2014

Not Everything Is Normal, or, How the well-compensated "geniuses" on Wall Street turned the Central Limit Theorem into a bowl-shaped mess

In the world of nature and in the world of manufacturing, extreme skewness is essentially unheard of. Outliers, though they occur from time to time, are rare, and therefore the “> 30” rule of thumb given in many stat textbooks (some books advise “> 25”) is safe whenever we are dealing with data distributions in most real-world situations.

For readers unfamiliar with what I’m talking about, the subject is Gaussian (a.k.a. normal) distributions and the closely related Central Limit Theorem, hereafter abbreviated CLT.

The rule of thumb says that the CLT can be safely applied whenever the sample size, n, is greater than 30 (or 25, depending on your textbook). The CLT, in turn, says that the sampling distribution of the sample mean is well approximated by a normal distribution centered on the true population mean, with a standard deviation that is smaller than the population’s standard deviation by a factor of the square root of n.
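For readers who would rather see the square-root-of-n claim than take it on faith, here is a minimal simulation sketch. The exponential population (mean 1, standard deviation 1), the function name, and the trial count are my own choices for illustration, not anything taken from a textbook.

```python
import math
import random
import statistics


def sd_of_sample_means(n, trials=5000, seed=0):
    """Empirical standard deviation of the sample mean for samples of size n.

    Assumption for illustration: the population is exponential with
    mean 1 and standard deviation 1 (mildly skewed, but no wild outliers).
    """
    rng = random.Random(seed)
    means = []
    for _ in range(trials):
        sample = [rng.expovariate(1.0) for _ in range(n)]
        means.append(statistics.fmean(sample))
    return statistics.stdev(means)


if __name__ == "__main__":
    n = 36
    predicted = 1.0 / math.sqrt(n)  # population SD divided by sqrt(n)
    observed = sd_of_sample_means(n)
    print(f"CLT prediction: {predicted:.4f}")
    print(f"Simulated:      {observed:.4f}")
```

With a well-behaved population like this one, the simulated value should land close to the prediction of 1/6.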

I know what you’re thinking. Bore, snore, zzzzzz. Who cares?

Well, the correct (or incorrect) application of the CLT can have huge consequences in situations that require parameter estimation in the face of uncertainty.

As I said above, the CLT can safely be applied whenever n > 30 in data from most real-world situations. However, two important exceptions are (1) the world of technology and (2) the world of banking and finance. Are you seeing where this could get us into real trouble?

There have been a few human beings taller than 7'6", but there has never been a human being 700 feet tall. However, multiple-order-of-magnitude differences for outliers do occur in the worlds of technology and banking/finance. Amazon.com receives hundreds of millions of page views per day, which is about 7 orders of magnitude higher than the Zocs.org blog you are reading, and a billionaire (of whom there are now about 500 in the U.S.) is 4 orders of magnitude above where most of us are on the wealth scale.

Sensible parameter estimation means that we compute not only a point estimate of whatever it is we are trying to estimate, but also a confidence interval: a range of values that we are fairly sure contains the true parameter. For example, if we want to know President Obama’s job approval rating, we can poll 100 randomly chosen adults and simply ask them their opinion. Some will say they approve, and some will say they disapprove. The margin of error (for a 95% confidence interval) with a survey that small is about plus or minus 10 percentage points, which means that if 42 people in our survey say they approve of President Obama’s job performance, we can be 95% confident that the true percentage, nationwide, is somewhere between 32% and 52%. If we want a more precise poll, we need to interview more people. With a sample of 1000 people, the margin of error would be much smaller, only about plus or minus 3 percentage points.
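The arithmetic behind those margins of error can be sketched in a few lines. This uses the standard normal-approximation formula for a proportion, with the conservative choice p = 0.5 and z = 1.96 for 95% confidence; the function name is mine.

```python
import math


def margin_of_error(n, p=0.5, z=1.96):
    """Normal-approximation margin of error for a sample proportion.

    p = 0.5 is the conservative (worst-case) choice; z = 1.96 corresponds
    to a 95% confidence level.
    """
    return z * math.sqrt(p * (1 - p) / n)


if __name__ == "__main__":
    for n in (100, 1000):
        print(f"n = {n:4d}: about +/- {100 * margin_of_error(n):.1f} percentage points")
```

Note that going from 100 to 1000 respondents multiplies the sample size by 10 but shrinks the margin of error only by a factor of about 3.2, the square root of 10.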

However, we cannot use a random sample of hit counts from 100 web pages to estimate the mean hit count per page for the Internet as a whole, not even with a large margin of error, since the Internet exhibits extreme skewness. That is to say, we could compute a point estimate, but it would be worthless, since the confidence interval we compute will usually not give us a true picture of our level of knowledge. If our sample of 100 happens to include Facebook and Amazon, we will wildly overestimate the parameter, whereas if (as is much more likely) our sample excludes Facebook and Amazon and Yahoo and Google and all the other top sites (which together account for the bulk of page views), we will dramatically underestimate the parameter. There are billions of web pages on the Internet, after all, which means that a random sample of 100 has practically no chance of including those big ones. Increasing the sample size doesn’t help much, either, since the confidence intervals will still be misleading and just plain WRONG. If you say you’re 95% confident that the true value of a parameter lies between 1.08 and 2.70, and the true value is 36,521, then you’re WRONG. Not just a little bit wrong, but colossally and embarrassingly wrong. The problem is that a relatively small number of websites account for most of the page views. If you want to estimate the mean hit count per page, or the total number of page hits, you’re going to have to use a measuring technique that accounts for the big players separately from the little ones. And that is feasible, in the case of web pages, since we happen to know who the big players are. But what if the parameter we are trying to estimate is something truly unknowable, such as the probability of a general financial collapse?
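Here is a rough simulation sketch of how badly a textbook "95%" interval can behave when the data are extremely skewed. I have no real page-view data, so a Pareto distribution with shape 1.1 stands in for hit counts; that choice, the function name, and the comparison normal population are all my assumptions for illustration.

```python
import math
import random
import statistics


def ci_coverage(draw, true_mean, n=100, trials=2000, seed=1):
    """Fraction of normal-theory 95% confidence intervals that contain true_mean.

    draw is a function that takes a random.Random and returns one observation.
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        sample = [draw(rng) for _ in range(n)]
        mean = statistics.fmean(sample)
        half_width = 1.96 * statistics.stdev(sample) / math.sqrt(n)
        if mean - half_width <= true_mean <= mean + half_width:
            hits += 1
    return hits / trials


# Hypothetical stand-in for page views: Pareto with shape 1.1, whose
# true mean is shape / (shape - 1) = 11 but whose variance is infinite.
ALPHA = 1.1

if __name__ == "__main__":
    heavy = ci_coverage(lambda rng: rng.paretovariate(ALPHA), ALPHA / (ALPHA - 1))
    tame = ci_coverage(lambda rng: rng.gauss(11.0, 3.0), 11.0)
    print(f"Coverage, heavy-tailed population: {heavy:.0%}")
    print(f"Coverage, normal population:       {tame:.0%}")
```

For the normal population, the intervals should capture the true mean roughly 95% of the time, as advertised. For the heavy-tailed population, most samples miss the rare giant values entirely, so the intervals sit well below the true mean and capture it far less often than the label promises.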

The web page example above was an example from the world of technology. For an example from banking and finance, think of the extreme skewness and extreme outliers seen in wealth and profits. Two people, Bill Gates and Warren Buffett, all by themselves, have approximately as much wealth as the lower 50% of Americans combined (approximately 160 million people). In 2013, the top 10 most profitable corporations in America earned roughly one-seventh of the total profit that was earned by all businesses, and remember that when you count all of the dry cleaners and gas stations and CPAs and tutors and hair salons in America, you’re talking about tens of millions of businesses.

Therefore, the simple rule of thumb (n > 30 or n > 25, depending on your textbook) is not appropriate for the worlds of technology, banking, and finance. What should we call people who use the CLT or Gaussian models to perform economic risk analysis (think: the “geniuses” who gave us the catastrophe of 2008, which we are still digging out from)? Eight million people lost their jobs in the U.S. alone in the aftermath of that catastrophe, and some of the victims are still unemployed or underemployed today. Here are my suggestions for what to call the perpetrators: overconfident, arrogant, ignorant, and above all, overpaid.

Not a single one of them went to jail. The arrogance and overconfidence they exhibited are still with us, and the banking reforms that Congress implemented to try to prevent a similar catastrophe have been less than successful. It’s quite likely that the whole cycle will be repeated, on an even larger scale.

The quants (quantitative analysts) on Wall Street weren’t necessarily arrogant and overconfident, but the people who listened to them uncritically and ignored the associated caveats certainly were. An awful lot of the quantitative analysis involved in the overleveraged investments of the mid-00s was based on Gaussian models and the CLT.

There were other bad assumptions, too: assumption of independent events when computing risk, assumption that real estate appraisals and investment ratings were made in good faith (when frequently they were made with conflicts of interest), assumption of ability of borrowers to repay loans, etc. There was also a good deal of abdication of due diligence in evaluating investments, not to mention outright fraud. But the misapplication of the CLT is right up there.

I would claim that there’s nothing inherently wrong with arrogance and overconfidence. Shoot, if we didn’t have arrogance and overconfidence, we wouldn’t make much progress as a species. Every significant advance requires someone wildly arrogant and overconfident (and, usually, incredibly lucky) to spearhead it.

The American way is that arrogant and overconfident people should be paid what they deserve. If they are taking extreme risks, they should lose their investment most of the time, and every once in a while, they should have a spectacular success and earn a lot of money. That’s fair. That’s the way it should be.

My objection is that the arrogant and overconfident people who gave us the Great Recession of 2008 didn’t lose their shirts. Their income and wealth are up dramatically since 2008. Effective tax rates, including those resulting from the “temporary” Bush-era tax cuts, are at near-record lows for both wealthy individuals and corporations.

As for banking profits and corporate profits in general, they are both at all-time record highs. The reason? We, the taxpayers of America, bailed out AIG (and, by extension, Goldman Sachs) and the big banks when the whole system was approaching a total meltdown in 2008. You'd think that the least they could do for us would be to give the U.S. Treasury a hugely generous return on investment.

(Yes, yes, I know that the government ultimately made a good profit on the AIG bailout and the Fannie Mae/Freddie Mac overhaul, as well as the bailout of the “too big to fail” megabanks. But considering that none of those organizations would be employing anyone today if we had not bailed them out in 2008, it seems to me that we should be entitled to hundreds of billions or even trillions of dollars in compensation, not the paltry tens of billions we received. And the former head of AIG, Hank Greenberg, now has the gall to sue the U.S. Government for $25 billion, claiming that the bailout, the one that saved his company and kept his shares from being worth exactly $0, was illegal.)

We had no choice but to bail the scoundrels out in 2008. That’s right, we had no choice. We had to bail them out, because the alternative would have been a global financial crisis that would have made the Great Depression look like a summer breeze.

However, we don’t have to let history repeat itself. Fool me once, shame on you (though I apparently can’t send you to jail). Fool me twice, shame on me.

Friday, November 30, 2012

We Are the Miracles Other People are Praying For, or, Thoughts from the data recovery trenches

Recently (OK, it was at 4 a.m. today, so I was a little punchy after a lot of hard work) I had the transformative experience of being an agent for good for a change. A friend of mine, a woman in her late 80s who is, shall we say, not a "digital native," had managed to overwrite not only her life history in Microsoft Word but also the backup file. When I say "overwrite" I mean that she had probably pressed Ctrl+A followed by spacebar, thus converting 400-odd pages of carefully wordsmithed prose into a single space character. If she had pressed Ctrl+Z (Undo) at that point, or if she had closed the file without saving, I wouldn't have anything to blog about today. However, what she did was to save the file under its existing file name. Ulp. Then, 9 minutes later, apparently seized with panic, she saved the file again, thus destroying the automatic backup copy. Double ulp.

There is plenty of "undeletion" software to recover deleted files. Undeletion is simple, since the metadata (file name, time, date, etc.) are not destroyed. Unfortunately, that's not the situation I faced. When my friend saved her file under its existing name, all the metadata were transferred to the new file, and the old data were left in a state of limbo, drifting aimlessly in the hard disk's hundreds of gigabytes of unallocated space. To recover data from the unallocated space, one needs special "carver" software (I used PhotoRec) and a lot of patience.

When I first spoke to my friend yesterday, she was distraught and was relying on me to do something. The most recent partial backups I could locate were way too old to be useful, and though she had some hard copies, they were out of date and would have required many hours of painful retyping at a minimum. So, when I informed her today that I had, indeed, managed to recover something that looked very much like her 414-page memoir data file, she was in tears. From her point of view, it was a miracle!

Most people wouldn't call it a miracle. I knew exactly what I had done, and it wasn't rocket science. True, sending the hard drive out to a data recovery service might have cost a lot of money, but no single step had been particularly difficult or had required anything outside my skill set.

Then it hit me . . . just because I didn't think it was a miracle doesn't mean it wasn't a miracle. From my elderly friend's point of view, it most certainly was a miracle. I felt privileged and deeply humbled to have been placed in the right place at the right time to engineer her miracle.

And when a natural disaster hits people elsewhere in the world, or a personal crisis hits someone we know, we may find ourselves in a position of being able to answer their prayers.

We are the miracles that other people are praying for!

Monday, April 9, 2012

Everything We Need to Know about Millennials, or, "My unscientific rant on an entire generation"

Millennials tend to be

1. Not alert
2. Bite-sized, with everything at arm’s length
3. Addicted to context switching (do NOT say multitasking)
4. Largely lacking in curiosity
5. Deficient in written and oral communication skills


Footnotes:
=========
1. It’s not that Millennials have an attention deficit disorder or are incapable of paying attention. They pay close attention to live sporting events, for example. The problem is that the “Tivo culture” has trained them that almost nothing is worth paying attention to. Anything significant will be stored, archived, and endlessly replayed for them, so why bother paying attention the first time?

2. Millennials have little or no interest in anything “long-form.” YouTube videos are popular only if they are short. Movies are watched in snippets. Audio files are heard in snippets. Scripted TV shows (with plot, a narrative arc, a beginning, a middle, and an end) are of waning interest. Articles are usually read in snippets--rarely finished before being interrupted by a hyperlink jump or a distraction. “In-depth research” refers to anything beyond the first page of a Google hit list, and there is no point in trying to think of different search criteria; whatever one comes up with initially is probably good enough. Books are largely irrelevant, not because they are lacking in content, but mainly because it is not possible to copy-and-paste from a book. Most Millennials, if forced to attend a one-hour performance of a symphony (or even a rock opera), would feel compelled to tweet or text about the experience while it was happening--they couldn’t simply experience it. By the way, the arm’s-length approach to everything explains why texting is vastly more popular than chatting on the phone. Live, real-time interaction with a real human being takes one’s full attention if done properly, and who wants to be that involved? This leads to #3.

3. Ample experimental evidence exists to say that cognitive multitasking by humans is impossible. We have only one cognitive center. We can do multiple activities simultaneously (e.g., walking, breathing, and chewing gum) only when there are different regions of the brain responsible for those activities. Since there is only one cognitive center, we cannot engage in higher-order thinking about more than one thing at a time. Teenagers are better and faster at cycling their attention rapidly among activities (i.e., context switching) than adults are, but they are not actually multitasking. Context switching is less efficient than single-tasking and leads to less deep thought. Context switching while doing homework (e.g., IMing friends or Facebooking) ensures that homework is a superficial, chore-type activity, not something that has one’s full attention and engagement.

4. How does a CD player work? Answer: Nobody cares, we’ll buy a new one if it breaks.
How does a microwave oven work? Answer: You punch some buttons, and the food gets warmer.
How does a cell phone work? Answer: You talk into it, and people on the other end can hear you. But why talk? Texting is easier.
How does a PC work? Answer: You click on stuff. If that doesn’t work, double-click. Sometimes things go bad. When that happens, buy a new PC (or better yet, a Mac).
How did the Romans manage to engineer and build tunnels, bridges, roads, and aqueducts without modern earth-moving machinery or computers? Answer: It doesn’t matter, since we have all that stuff now. Besides, the Romans like went extinct like, what, a hundred years ago?

5. Written communication with Millennials is an acquired taste. Millennials, having been raised on texting, tend to avoid subject lines, capitalization, punctuation, honorifics, salutations, proper grammar, or closings and signatures. A minority of them can spell the word “tomorrow” correctly. Since IMs, texts, and e-mails are as numerous as the grains of sand on the beach, there is no need to obsess or worry about the wording of any message considered in isolation, or even to respond if one doesn’t feel like it. As for oral communication, many Millennials say “like” almost constantly. (Like, every third or fourth word.)


Good things about Millennials
=========================
There are also many good things about Millennials! Here is my list. Millennials tend to be . . .

- very trusting
- trustworthy
- opposed to smoking
- opposed to (or, more correctly, bewildered by) bigotry against gay people
- entrepreneurial
- much less car-oriented than previous generations
- somewhat more international and interethnic in their choice of friends
- open, not hypocritical (though sometimes they share too much!)
- less cynical, perhaps, than the Xers and the Boomers
- not bogged down by rules of etiquette or convention, since new conventions are being created all the time

Thursday, March 29, 2012

Highly recommended reading: "Stop Stealing Dreams"

Maybe you have already heard the buzz about Seth Godin's massive education-reform manifesto. Maybe you haven't. Either way, you really ought to read the entire thing. Here's the link: www.stopstealingdreams.com.

The only comment I can think of, off the top of my head, to add to Godin's magnum opus would be something like this:

134. Maybe it's because my first career was in business, not education, but this needs to be shouted: STUDENTS' TIME HAS A DOLLAR VALUE, TOO. IT'S NOT ONLY THE LABOR HOURS OF TEACHERS THAT COST MONEY. A school that organizes time and schedules for the convenience of adults and treats kids as if they should sit around, bored, most of the time . . . is a school that has no business surviving in the 21st century. Just as no businessperson would travel to a conference halfway across the country that consisted of nothing but non-interactive, large-group lectures, we shouldn't ask children to spend their precious time on this earth in so unproductive a venture. Seth Godin says that the original purpose of schools was to create compliant masses for the industrial economy and the consumer economy, and he says they did a reasonably good job of it. I disagree. Old-style schools do not breed compliance; they breed boredom and lethargy. They breed workers who think it's OK to loaf all day on the job, since they have been loafing from the ages of 6 through 18 in the public school system.

Germany has one of the most productive economies on earth, and Germans take five weeks or more of vacation per year. Americans are lucky to get a week or two. We Americans loaf in school, then scramble to make money by logging more hours on the job. If we worked harder in school, we could work smarter (not harder) in the workplace and have a much better quality of life. And maybe better beer, too.

I'm not saying that high-tech should replace schools. Far from it! I'm saying that education always has been, or at least should have been, a HIGH-TOUCH operation. Kids need lots of care and attention to grow up, and we should use all the high-tech tools we can possibly get our hands on to provide them with that HIGH-TOUCH personal care and attention, not factory-style instruction.

Sunday, November 28, 2010

Announcing . . . The Teachers' Rodeo! or, "If you fund it, they will come"

The other day, I was thinking . . . maybe, just maybe, Arne Duncan and the other high-profile education reformers (including the recently departed Michelle Rhee and the soon-to-be-departed Joel Klein) are right. Maybe it really is possible to identify the best teachers by looking at how much the test scores of the young people assigned to them improve.

If that is true, then surely we should have a competition to identify the best teachers at every level: school, district, region, state, the whole country. Just as the NFL and the NBA lure today’s schoolkids to spend their time practicing football and basketball, teachers’ rodeos would energize, identify, and glorify superstar teachers. President Obama should set aside a portion of the $4.35 billion Race to the Top (RTT) fund for the Race to the Top Rodeo.

My endorsements and prizes will bring
Big SUVs, houses, and bling.
Who cares for my pupils?
I laugh at your scruples.
Quantifiable achievement is king!

Forget for the moment that money and fame are not primary motivators for most teachers—or if they are, we definitely chose the wrong profession. The Race to the Top Rodeo (RTTR) will create a whole new category of teachers. Bolder! Cleverer! Faster and more efficient in every key area of student interaction!

Here’s how RTTR works. Each contestant chooses one or more events to enter. There will be . . .

  • Bull riding! Here, each teacher confronts a series of unmotivated students armed with state-of-the-art excuses. The winner is the teacher who can most effectively deflect all the excuses and persuade the students to listen attentively for a minimum of 9.00 seconds. If even one student sends a text message, shoots a spitwad, or asks to go to the bathroom during the 9-second window, the teacher is disqualified. Timing begins only after all excuses have been successfully defused to the satisfaction of the judges. Most observers agree that the 9 seconds that follow are even more dangerous than the 8 seconds allotted to professional rodeo bull riders.

  • Team roping! In this event, pairs of teachers, normally an idealistic rookie and a cynical old hand, compete to bring down a randomly chosen high school senior. Slacker students are preferred, with extra points awarded to any team that draws a student with a particularly “whatever” attitude. A functional MRI (fMRI) scanner reveals whether anything either teacher says or does manages to penetrate the student’s consciousness. Real-time monitoring by PETA ensures that no student is harmed by the scan or treated inhumanely. Any activity that resembles repetitive drill, even if it has educational content, is prohibited on the grounds of being old-fashioned.

  • Barrel racing! Here, randomly assigned students are dropped from the chute in classes of 35. Each teacher contestant has exactly 50 minutes to herd the group into part of the rodeo ring and teach them something they can remember long enough to increase their collective score on a standardized test by at least 10 points. (A proposed rule modification would reduce the time of the event to 12.5 minutes, which would not only improve audience focus but would also more accurately reflect the length of a real class period, after deducting for roll call, bathroom excuses, and general malarkey.) The winner is the teacher whose students achieve the greatest increase, and that teacher receives a generous raise for the next school year. Other teachers who meet the 10-point threshold are given the title of “minimally adequate teacher” and are placed on probation. The remaining contestants, those who fall short of the 10-point mark, are crushed by barrels, to the great amusement of the spectators. Students who participate in the barrel-flattening portion of the event are given iPods as a reward for good citizenship.

  • Whipcracking! Teachers are no longer allowed to compete in whipcracking. Contestants are now drawn exclusively from the ranks of top administrators and school chancellors. Students are removed from the arena during this event, lest their psyches be permanently damaged by the sound of a whip. Teachers who lost at earlier events are seated in the center of the rodeo ring and are forced to endure endless demonstrations of whipcracking prowess. (The losers from the barrel racing event are exempt, since most of them are too beaten up to sit.) Scoring in this event is generally ignored, since all whipcracking chancellors receive huge cash prizes, regardless of whether or not they have any ability. The most famous whipcracker in the arena is given the title of “Top Educator” and is allowed to take a ceremonial lap around the ring in a custom golf cart purchased with federal funds. This victory lap is called, of course, the Race by the Top (RBTT).

My biggest fear in listing these events is that someone will not realize that my tongue is planted firmly in my cheek and will get the idea of starting a real rodeo-style competition for teachers. It would not be that hard to pull off, actually—certainly less complicated than a conventional rodeo with all its horses and cattle and ropes and gates and gory injuries. Students could be randomly assigned to teachers and given safe, indoor competition events as part of their regular schooling.

And, to be completely serious for a moment, let’s acknowledge that spending a couple of billion dollars on local, regional, and national teachers’ rodeos would have a big impact. Big money works in professional sports, and a professional teachers’ rodeo could attract heavy bipartisan support. Just as students spend countless hours chasing stardom by honing their hook shots, teachers would vie to become extremely skilled in the various tasks that help their reluctant students accept what the government calls “education.” There would be some subjective scoring, just as in real rodeo, but the winners would be determined quantitatively, by competition. Remember, that is the absolute, bottom-line measure of goodness and effectiveness in a business enterprise.

What, you don’t think education is a business enterprise? Surely you have been under a rock for the last decade. Everyone nowadays knows (cough, cough) that K-12 education can and should use business-oriented measures of effectiveness, accountability, and achievement. There is no other way forward for America.

And the day that happens is the day that teachers’ last remaining shred of professional dignity wafts away on the breeze. We’re not quite there yet, thank heavens.

Sunday, August 22, 2010

More Voices Crying in the Wilderness, or, "Others with better credentials seem to agree with me"

In July I read Diane Ravitch's recent book, The Death and Life of the Great American School System: How Testing and Choice Are Undermining Education. Ravitch, a former assistant secretary of education, echoes many of the themes from my previous post (April 18, 2010).

Now, obviously, Ravitch didn't get any ideas from me, and even if she did read my blog (as if!), her book was sent to the publisher many months ago. My gratification in reading the book was merely that she had analyzed many of the same issues I had looked at, and she had arrived at the same general conclusion: In our haste to reform education, we will probably make things much, much worse.

In today's newspaper, Ravitch has written short reviews of three new books on education reform. I intend to read at least two of these books in the weeks ahead.

Dear reader, if you do not have the time to read the reviews linked above, please allow me to quote a portion of Ravitch's review of Linda Darling-Hammond's new book:

Darling-Hammond does something that the Obama administration has not: She reviews what the top-performing school systems around the world do to get great results. Their highest priorities, she shows, are building a strong, experienced staff and making sure that every school has access to a rich, well-balanced curriculum in the arts and sciences. Finland, the highest-performing nation, has not relied on testing and accountability to achieve its current status. [Washington Post, page B7, Aug. 22, 2010]

Are we Americans so arrogant that we can't take the time to learn from the Finnish success? To be sure, Finland is much more homogeneous than the U.S., and has a completely different culture, but the fact that the Finnish reforms have not been based on market-based approaches (read: bottom-line focus) should give us pause.