Preface
Somewhere down the trail a brave historian is going to ask the question: How could a generally literate society not stand up and protest the destruction of real learning in its public K-12 schools by a U.S. Secretary of Education observing learning through a straw, while his chain is being yanked by Bill Gates; by a cabal of socially irresponsible testing corporations; by a generation of for-profit psychometricians operating in corporate bubbles; and by states’ political infrastructure putting ideology ahead of a nation’s next generations of citizens?
The companion, more localized question is: How could our collegiate schools of education, and virtually entire local K-12 public education establishments, express that same ignorance, or be sycophantic toward alleged reform pivoting on a naïve form of testing?
If you believe that
improving U.S. public K-12 education goes beyond alleged standardized
multiple-choice testing, and are a fan of Malcolm Gladwell (The Tipping
Point), take solace in some recent developments.
Straws in the Wind
Unknown to many
Americans, the U.S. emulated the UK in pushing standardized testing, notably
because of Margaret Thatcher’s advocacy in the 1980s. Other countries then emulated those U.S. public primary and
secondary education strategies.
Specifically, the use of standardized testing has been prevalent in
Israel and in other parts of Europe, and long enough for its efficacy to be
assessed.
Domestically,
California has now voted to drop most K-12 standardized testing. New York State has now stated it will
reduce its use of that testing, and critics in that State are advocating even
broader cuts. Given that CA and NY are generally where trends begin, perhaps we are seeing the approach of that “tipping point”?
Internationally, Scotland
discontinued that testing in 2003.
Israel has now discontinued the testing. Wales recently rescinded most standardized testing, and the
reasons are notable:
“What do Welsh teachers use instead of the
tests? With government guidance, teachers come up with their own assessments
and report the results to parents, local education authorities, and the Welsh
government each year. Freed from the need to prepare students for narrow tests,
secondary school teachers employ out-of-school experiences, in-depth research,
and presentations, emphasizing applied learning in secondary school and
underscoring the importance of play in early childhood education.
Brian
Lightman, head teacher at a secondary school outside Cardiff, Wales, helped
pilot some of the new approaches and is impressed with the results. ‘Our
students now are so much more independent and capable of organizing and
analyzing what they're doing, and they're able to improve as a result of that,’
he said. ‘They are very different in the way they go about their learning.’”
Only the U.S.
appears still fully in the grip of something close to mass hysteria – or
perverse dedication at our state levels to extreme conservative school
ideologies executed pretty much without critical thought. Unfortunately, this tunnel vision
extends all the way to Arne Duncan, and one step beyond to the gross hypocrisy
of President Obama. Mr. Obama, in virtually every speech touching public education, churns out the right words about the need for K-12 understanding and the limitations of present testing obsessions, but then blesses the actions of Duncan and the U.S. Department of Education in doubling down on the imposition of testing.
With the above bits and pieces suggesting an emerging challenge to the standardized testing orgy, why is it so deeply entrenched? By the ignorance and political extremism of current Republican state education bureaucracies and legislators? Or by the combination of naïveté and cowardice of too many of our public schools in America’s “Pleasantvilles,” lacking intellectually and managerially competent administration and better training than is being turned out by our collegiate schools of education? Or by generations of parents, products of the same systems, lacking criteria other than local ego, splendor of physical plant, and sports obsessions to guide local schools? Or, notably, because of the frequent election to local school boards of members lacking the competence or experience to provide oversight, or in some cases those with other agendas?
Award-winning NYC principal Carol Burris, in a recent piece in The Washington Post’s “The Answer Sheet,” offered another perspective that addresses the top-down issues:
"What is equally disconcerting is that these reforms are being
pursued with little or no evidentiary grounding. There is, for instance, zero
sound research that demonstrates that if you raise a student’s score into the
new proficiency range, the chances of the student successfully completing
college increases. New York’s new cut scores are an attempt to benchmark state
scores to the proficiency rates attached to the National Assessment of
Educational Progress, or, NAEP. Yet the connections between NAEP scores and
college performance are so spurious that researchers have yet to claim
that NAEP scores have any predictive value at all when it comes to college and career
readiness."
"The bottom line is that there are tremendous financial interests
driving the agenda about our schools — from test makers, to publishers, to data
management corporations — all making tremendous profits from the chaotic
change. When the scores drop, they prosper. When the tests change, they
prosper. When schools scramble to buy materials to raise scores, they prosper.
There are curriculum developers earning millions to create scripted lessons to
turn teachers into deliverers of modules in alignment with the Common Core (or
to replace teachers with computer software carefully designed for such
alignment). This is all to be
enforced by their principals, who must attend calibration events run by network
teams.”
Obfuscation and
Myths
Normally this would
be the place to launch a spirited defense of those challenging standardized
testing. If the audience is
educationally literate that isn’t an issue. The limitations of alleged standardized multiple-choice tests
have been well documented for decades.
What is troubling is that multiple empirical studies of their
limitations were prominent in the U.S., by respected academic institutions,
before NCLB was launched by the Bush Administration. Even more studies and critique were available before the
bureaucratically-driven debacle of RTTT was launched with billions of dollars
in bribes to state governments. If
our Congressional Republicans wanted to plant a scandal-bomb under the White
House, it might better be shaped to open and reveal RTTT’s waste rather than
Benghazi or ACA.
A second bit of misdirection is the classic tactic of attacking the critic, in the case of present reform with the usually smirking question: Why are you against testing? Don’t we have to have some way of holding our tax-supported public schools accountable? One would suspect that those asking might be smart enough to know the answer is out there, and has been in place for most of the tenure of public education. Rejecting the present dominance of that testing has zero to do with the need to assess. Of course assessment of many types and testing of many flavors are needed, and have been the process material of public K-12 excellence for over a century.
A third topic that may not be visible in these debates is the checkered history of the hero/villain of present reform, the ubiquitous multiple-choice test. To the surprise of even many educators, the multiple-choice test format will mark its 100th birthday next year; precursors existed at the beginning of that century.
The multiple-choice
test was created by Frederick J. Kelly, a byproduct of his doctoral
dissertation at Emporia State University (formerly Kansas State Teachers’
College). Allegedly he was
motivated by the desire to eliminate the subjectivity of teachers’ judgments at
the time and to acquire “uniform results.” The approach was perceived as “…the assembly-line model of
dependability and standardization.”
Kelly was circumspect about the testing model. A quote attributed to him about the model would today generate a Twitter firestorm of political correctness. Kelly said: “This is a test of lower order thinking for the lower orders.” The testing was commissioned by the Army in WWI to evaluate recruits, but the story, not unexpectedly, is not quite that simple:
“Most of us have experienced a
multiple-choice test. Our children undergo them, you've certainly taken them,
your parents probably did, and for some, even their grandparents had to endure
them. All of us have given them the power to decide our destiny. But what most
of us do not know is that multiple-choice tests resulted from an attempt to
legitimize the field of psychology, with a dash of xenophobia and scientific
racism. Stephen Jay Gould spells out the dark past of these tests in his aptly
titled book The Mismeasure of Man.
This highly recommended read reveals all the gory details of IQ testing. Gould
explains that the development of IQ testing was used to identify
feeble-mindedness in ‘unwanted’ groups (usually determined by race or country
of origin).
Multiple-choice tests had their origin in
World War I, when Dr. Robert Yerkes, President of the American Psychological
Association (APA), convinced the Army to commission them to test the
intelligence of recruits. The Army's goal was to improve the efficiency of
evaluating men by moving away from time-consuming written and oral
examinations. Yerkes' motives were to make psychiatry a more scientific field
and move it away from its affiliation with philosophy.
A
total of 1.7 million recruits were tested, giving the multiple-choice test an
air of legitimacy, but in the end, the Army found no value in the results.
Yerkes omitted that part of the story when he sold this idea to educational
testing outfits. The validity of the test was not questioned. The rest is an
unfortunate history.”
The ironic
conclusion to Professor Kelly’s odyssey:
“A few years later, as President of the University
of Idaho, Kelly disowned the idea, pointing out that it was an appropriate
method to test only a tiny portion of what is actually taught and should be
abandoned. The industrialists and the mass educators revolted and he was
fired.”
The story gives
further reflective meaning to the old saw, no good deed goes unpunished in
American society.
The last piece of the puzzle about our testing trajectory has not been well aired: the role that the psychometric subset of psychology has played in creating the present mess. The field is focused on the construction and validation of measurement instruments, including tests and personality instruments. Its launch as a discipline is attributed to Sir Francis Galton (1822-1911), and the field developed with some distinction from the latter part of the 19th century through the present. The field is not a household word, and even surveying its history is well beyond this post. There are two key points, however, that merit comment at least at the level of a chapter title.
In this century psychometric modeling and math were greatly extended, and are applied to present standardized testing design through item analysis. Item analysis is a class of analysis that broke out of relative obscurity when our testing companies and their vended tests were identified as the source of large-scale or unexpected shifts in the results of state testing. There is little question that item analysis is valid as a mechanism for identifying test items that discriminate among human responses and can create gradients and clustering.
Point one is that the models could be applicable in any measure of a human population that displays discrete gradients of performance on some set of attributes. The models say nothing about the construct validity of the property being measured. In sum, as sophisticated as the techniques for deriving present test components are, the results have no intrinsic claim to measuring understanding. Thus, to the extent that much of present testing cannot be linked to clear statements of how test scores explain higher-order thinking and understanding, the use of test results as the definition of whether testing is of value is pure tautology and not a basis for claiming reform.
Secondly, there is even greater harm in the role our testing companies have assumed, with some hubris: designating what is knowledge, without transparency. Massively pervasive standardized testing, driving traditional attention to critical thought out of classrooms, de facto defines a nation's first 12 years of formative knowledge. Psychometric input deserves its provenance as expertise on selective test creation. But the creators and keepers of knowledge have been excluded; public education disavowed interest in content over a half century ago, its practitioners becoming classroom mechanics; and by default the nation's knowledge is now being devised by amateurs in all except selective disciplines. That would not appear to project a bright future for our national intellect.
So, How Assess?
Somewhat cynically,
I suspect the shadow version of this question is, how assess our students’
performances without working too hard?
If that is the basis for much of public education’s slavish acceptance
of present standardized testing, we have indeed evolved a pretty sick public
education system.
If instead, the
basis is that our education community simply knows no better, then a recent
article referencing educator training, by The New York Times’ writer
Bill Keller (“An Industry of Mediocrity”) merits your reading.
A third possibility is that our public school systems have been so intimidated (or bought off) by Federal initiatives and state controls, or that their school boards and administrators are so fearful, that they cannot actually operate on the basis of communities’ desires for local education control. The answer then is at the ballot box, if a community’s school board representation by free election hasn’t already been rigged by incumbents, one of the key sources of local public school corruption of democratic process. The clues aren’t hard to identify; try a ballot with three candidates, vote for three. Democratic process in action, or election fraud?
Assessment that
evolves from teaching that recognizes and emphasizes understanding and learning
is hardly a mystery. Here is
simply a topic list of some proposed assessment methods:
- Classic Socratic questioning
- Mastery learning
- Project application of constructs
- Student progress reports (à la Gardner or Boyer)
- Performances
- Authentic assessment (usually with PBL)
- Embed standardized tests as pragmatism
- SCALE (Stanford Center for Assessment, Learning and Equity)
- Old fashioned quizzes
- Product related outputs
- Process related outputs
- Writing, essays!
- Use authentic audiences for demonstration of performance
- Role-based (in PBL)
- The flipped classroom engaging parents
- Use Bloom’s and Marzano’s taxonomies
- Dynamic testing (integrated with teaching)
- Indirect assessment (with formative and summative assessment)
- Interactive analysis
- Mathematical thinking
  - Fault finding and fixing
  - Plausible estimation
  - Creating measures
  - Convincing and proving
  - Reasoning from evidence
- Conceptual diagnostic tests
- Attitude surveys
- Concept mapping
- Exhibitions
- Portfolios
- Self- and peer-evaluation
- Gaming outcomes
- Simulations assessing performance
- Artificial intelligence (expert systems)
- Longitudinal performance tracking
In a recent
communication, master educator and author Dr. Marion Brady (who was inventing education
before most of you were born) proposed an assessment philosophy that he
commented may not be ready for prime time. I believe it is “just in time”
if you will pardon my reversion to a prior profession. Marion’s take:
“The reform cart is in front of the horse. Its
initial assumption is faulty. The aim isn’t to teach the core subjects well,
but to rear smart kids. If I’m right, then the first step in a proper reform
effort is creating tests. Tests first, not last—tests that evaluate what
Einstein said should be our first priority—the ability to imagine alternative
futures and deal with the problems those futures create.
That done, tell teachers to have at it. If
it’s thought that standards are needed, let teachers write them, but keep them
in electronic form so they can continuously evolve as professional dialogue
expands expertise.
.
.
.
My evaluation-related assumptions: (1) Evaluation
tasks should require kids to apply what they know in a not-previously studied
situation; (2) the best tasks are concrete rather than abstract, real-world
rather than theoretical, ‘supra-disciplinary’ rather than tied to a single
school-subject; (3) there’s no good reason for a test to be timed; (4) a good
task requires no security measures, no honor code, no anti-plagiarizing
strategy, no vigilant watching for evidence of cheating. The response to a good
task will be so idiosyncratic that any teacher in charge of a reasonably-sized
class for more than a very few weeks will know who wrote what.
Many years ago, when I first read Alfred North
Whitehead’s 1916 Presidential Address to the Mathematical Association of
England, I was mystified by his insistence that ‘no educational system is
possible unless every question directly asked of a pupil at any examination is
either framed or modified by the actual teacher of that pupil in that subject.’
It took me many more years to see the wisdom in
that requirement. Now, I can see no acceptable alternative.”
Bottom Lines
As a way of summing,
consider this quote from a high school student featured in the Lucas
Educational Foundation site, “edutopia:”
“And yet, in the world of education, the "next
big thing" is merit pay for teachers and boosting test scores. Do our
policymakers not understand that the world is going through a revolution in the
way we live, interact and learn?
Our education system is stuck in paralysis. We have
tried doing the same thing over and over again with the expectation of a
different result. This is insanity at its finest. The way we educate is based
on the tenets of the Industrial Revolution -- conformity and standardization.
For instance, creativity is virtually extinguished
as a child goes through his or her schooling. In their 1998 book Breakpoint and
Beyond, George Land and Beth Jarman refer to a study in which 1,500
kindergartners between three and five years old were given a divergent thinking
test. Divergent thinking tests don't measure creativity, but rather one's
propensity for creativity. The test asks questions such as ‘How many ways could
you use this paperclip?’ or ‘How many ways could you improve this toy fire truck?’
-- questions designed to encourage creative thought rather than elicit
right-or-wrong answers. Ninety-eight percent of kindergarteners tested at
genius level. The kids were tested every few years. By the end of
post-secondary education, only two percent of students tested at genius level.
So, if you're trying to produce compliant,
dead-brained, formulaic workers, our system is doing exactly what it was
designed for. (I should add ‘grade-obsessed’ to that cadre of properties.) But
in a society where innovation is simply everything, it is a cultural and moral
failure to encourage this compliance.”
There is, when all denial is purged and all preconceptions and pretensions are deflated, still the belief that there is some ‘magic sauce’ that will transform a U.S. public K-12 education system that lost its will to excel and its capacity for servant leadership some decades ago. When confronted with examples of Finland’s comparable systems, or Singapore’s, or Shanghai’s, and their successes relative to the U.S., the prototypical domestic response is some form of “but they’re smaller, or more homogeneous, or more socialistic.” NYT writer and author Tom Friedman, a savvy observer of our society, recently visited Shanghai in search of the ‘magic sauce’; there isn’t any, but there is a master class lesson in K-12 education.
Part 4