Sunday, December 28, 2014

Assessing US Higher Education: Information, Intimidation, Ignorance, or Insanity?

The last post of Edunationredux offered a partial critique of the Obama/Duncan scheme to rate America's colleges and universities. Prior national critique reflected almost a "you gotta be kidding" ambience, illuminating the perceived chasm between what Arne Duncan and the US Department of Education are proposing, and anything resembling intelligent social science applied to the measurement task.  Today’s post extends the prior critique, exploring the real measurement chores needed to create valid and reliable ratings of America's colleges and universities.

That chasm between the proposal and reality is so great it raises major questions:  What conceptual malaise and what leadership degradation have occurred in that Department, who is steering this measurement debacle, and what resources are executing the work?  Is the proposal chain-rattling just to get the attention of higher education leadership?  If the intent is to actually carry through the scheme, is this another Federal agency that has now lost steerage, and mismatched the resources needed to actually conduct competent education work?

Post Critique, Critique

One tiny slip in the pronouncement of a functionary in the Department of Education may have given away the naïveté and slanted thinking footing the current proposal:  One of the factors allegedly being considered was how to treat "improvement" as a variable, and presumably as a simple metric.  The statement implies that the designers of this scheme may see the assessment of our colleges and universities occupying the same conceptual space as improving test scores in a public school system.  There are likely a few community college-scale institutions, close to being simply extensions of high school level performance, where this may be applicable, but any resources knowledgeable about the functions within a major university would deservedly see this as bizarre.

A last retrospective issue is further scrutiny of the misguided proposal to use beginning salaries of graduating students as a basis for institutional assessment.  This component of the proposal has some serious logic issues.  Aside from the nearly impossible chore of equilibrating the professional destinations of students across institutions to create one valid metric (or even multiple metrics), and the cognitive error of relating quality to profession sought, a peek at the distributions of those starting salaries poses an even more daunting issue.  Starting salaries are not distributed normally, but are skewed, with a long tail toward the high end.  The overwhelming body of starting salaries is so tightly clustered, the distribution so leptokurtic, that little if any discrimination among most salaries attributable to institutions could be detected.

A pretty cynical outcome of using the proposed metric(s) for salaries, aside from all other faults, is that success in that venue would come from maximizing an institution's output of petroleum engineers, and wiping out the education of all PreK-12 teachers.  If the underlying intent of this scheme is some social engineering to equalize higher education opportunity, and social and economic states, its extreme liberal designers need to go back to the drawing board, or better, acquire some higher education.

Fair Challenge

The classic, and legitimate, challenge to last post's critique of what's proposed -- that it is a loser -- is to provide a more effective system for assessing our institutions.  The remainder of this post takes a stab at that challenge.


The starting point in this quest is identical to every legitimate research effort since the Enlightenment:  What is the goal, what hypotheses are to be tested, what question or questions are being posed for answers; what is the universe from which measurements are sought; what are the variables or factors requiring measurement, and what are their functional relationships to the criterion question(s); what are the properties of the variables, in this instance measurements wanted, i.e., nominal, ordinal, interval, ratio; what are the hypothesized or measurable distributions of measurements sought; how do the error terms intrinsic to all variables fall out, intra-institutional variance versus inter-institutional variance, driving the comparisons of institutions or institutional subsets sought; what are the weights of contributing variables in forming, and then informing about, the differential effectiveness or qualities of institutions being assessed; and critically, with a finite set of candidates for positioning, how may the units in the universe need to be stratified or clustered to minimize confounding of results attributable to basically different higher education systems being appraised?

Given a US universe of 4,140 institutions of higher education, with internal partitioning that may multiply the actual units of analysis by orders of magnitude, with hypothetically complex variable sets driving the criterion effect, the project is not the simplistic vision of the US Department of Education, revolving around already extant data, but what is now colloquially termed "big data," an "all-encompassing term for any collection of data sets so large and complex that it becomes difficult to process them using traditional data processing applications. The challenges include analysis, capture, curation, search, sharing, storage, transfer, visualization, and privacy violations."  The mission here, assigning performance ratings to America's colleges and universities, is arguably the very definition of the analysis challenge described.

Department of Education thinking is apparently to measure some amalgam of institutional functional performance and contribution to social goals.  Both become subdivided into constituent goals that complicate what is proposed and currently measured:  For performance, institutional graduation rates overall versus by students' degree tracks, as well as longitudinally by how the process is finally achieved and the time involved; the learning effectiveness of what's been acquired along the way (made more complex when apportioned among multiple disciplines and degree tracks); the complexity of devising true costs of education delivered, plus the cogent issue of the productivity of all of the assets and operations incurred to produce a graduate; and close to the most salient first use of any assessment, whether the results actually and materially improve the choice processes of prospects seeking higher education.  Also ignored in the Department's rhetoric is the longitudinal complexity of the worth of prior learning at exit from the institution, versus its worth at the various career stages the graduate experiences.

Measurement Factor Complications

The performance of our institutions in creating equitable student access may be slightly easier to assess in principle, but introduces major problems in execution:  A large multivariate causal set of determinants of schools screened, preceding the issue of differential institutional compliance with equitable admissions, is problematic; the reality that acceptance of those who might be discriminated against is also based on the failures or successes of our public K-12 systems, long before an institution's action effecting equity kicks in; and a major barrier to measurement at the level of the individual student/family is driven by confidentiality considerations.

A case in point on pre-collegiate experience, a relative of this writer, is a collegiate freshman at a major university, majoring in an engineering specialty.  Partially because of the 9-12 work in an effective science high school, this soon-to-be second semester freshman will be moving into second semester sophomore level academic work with perfect "A" grades, primed by the prior high school work.  Adding to the analysis challenge of assessing institutional performance, then, are the assets/deficits that precede and impact acceptance.  The remedial work impeding, or prior learning permitting accelerated collegiate work, becomes another complication in assessing collegiate end-game contribution.

Another set of factors in judging performance is the subjectivity of protocols of collegiate grading, variable among institutions, among schools, among departments, and even among individual faculty.  Without some national, standardized achievement testing, by specific disciplines or academic track of students, the comparative use of even grades and point averages as measures of institutional performance adds complexity to any rating scheme.

The prior Edunationredux blog also unfolded another major constraint, comparison of institutions based on the proper unit of analysis as well as assuring comparability, rendering the simplistic measurement chore implied in the Obama/Duncan thinking the height of amateurism.

Still another factor ignored in the current conceptualization is the role played by geographic and location factors, perhaps even highly specific location factors related to the population and cultural composition surrounding a student's residential assignment, influencing institutional outcomes.

But there is another gut issue that will at present -- in the absence of benchmark research on our colleges/universities, which has never been executed -- blindside and hamstring the proposal.  That is the core pattern of variance of any variable or factor used as a basis of measurement.  In virtually all diversified and complex systems (precisely what every major college/university is) there is leveling of outputs based on de facto competition.  In common sense terms, there may be more variation of performance within an organization than among similar organizations, where an attempt is made to sum or average overall experience.  The practical significance:  With a small bit of coaching, human experts on higher education can likely identify the better or worse extremities of "high performing" and "low performing" colleges/universities.  The in-the-middle thousands may blur because their performances tend to regress to each stratum's universe mean.  Consider that in the last half century no credible college or university has been put out of business because its outputs were wholly without merit, or its graduates could not acquire employment.
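The within-versus-among variance point can be made concrete with a small sketch.  It is illustrative only: the institution names and program-level scores below are invented for the example, and the one-way sum-of-squares split is one common way to express the idea, not a method proposed by the Department or drawn from any actual data.

```python
from statistics import mean


def variance_shares(scores_by_institution):
    """Split the total sum of squares into between- and within-institution shares."""
    all_scores = [s for scores in scores_by_institution.values() for s in scores]
    grand_mean = mean(all_scores)

    # Between: how far each institution's average sits from the overall average.
    ss_between = sum(
        len(scores) * (mean(scores) - grand_mean) ** 2
        for scores in scores_by_institution.values()
    )
    # Within: how far each program sits from its own institution's average.
    ss_within = sum(
        (s - mean(scores)) ** 2
        for scores in scores_by_institution.values()
        for s in scores
    )
    ss_total = ss_between + ss_within
    return ss_between / ss_total, ss_within / ss_total


# Hypothetical program-level scores (assumed data, not real measurements).
example = {
    "Institution A": [55, 70, 85, 62],
    "Institution B": [58, 72, 83, 66],
    "Institution C": [52, 69, 88, 64],
}

between_share, within_share = variance_shares(example)
print(f"between institutions: {between_share:.1%}, within institutions: {within_share:.1%}")
```

When the within-institution share dominates, as it does in this invented data, a single institution-level average carries little information for separating one school from another.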

Rank Versus Supply Real Information for Choice

The commercially hyped collegiate rating schemes -- U.S. News, Forbes, Princeton, et al. -- have been widely criticized for their simplistic foundations, and the reality that they offer minimal discrimination of a complex product.  But they, along with such counterproductive ratings as "best party school," are still allegedly used for input to a critical life decision, an American tragedy.  That prompts the leading question:  Is the Obama/Duncan strategy embodied in the proposed rankings one of the worst decisions of this administration, matching or exceeding even the core ignorance of present punitive-based testing in public K-12?  Would far better choices have been, for example, the long view with strategic research to field a legitimate comprehensive rating scheme for our institutions' multidimensional areas of performance, call it the 'value-rating' model; or a non-punitive and affirmative alternative 'value-choice' model, the mission, providing comprehensive valid and comparable information on all public higher education institutions, letting the user supply their own criteria for use of the information for choice of school?

Both example approaches start with the same research roots:  A priori judgments of the factors considered central to the quality and equity of higher education delivered, irrespective of whether those factors are presently quantified; next the development work is executed to convert those multidimensional factors, by algorithm or by scaling techniques, into digital metrics.  At this point the approaches bifurcate, value-choice becoming the issue of creating easily accessible and universal databases, placing them in "the cloud" readily available online, searchable via criteria pertinent to the individual collegiate wannabe, or in another possible form as the material for use of simulation to derive optimal choices for a student.  The rest of our real world is inundated with clever "apps," available for even the ubiquitous smart phone.  Publicly accessible digitally, online, the system offers at low or no cost the structured information to personally search possible school choices.  The values or experiences available from a candidate school remain the elections of the potential student and parents, not predetermined by big brother.
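A minimal sketch of the value-choice idea follows, assuming factor metrics have already been scaled to a common 0-to-1 range.  The school names, factor names, and weights are all hypothetical, invented for illustration; the only point is that the same database yields different orderings depending on the criteria the user supplies.

```python
def rank_by_user_criteria(schools, weights):
    """Order schools by a weighted sum of factor scores, using the user's own weights."""
    def score(record):
        return sum(weights[factor] * record["factors"][factor] for factor in weights)

    return sorted(schools, key=score, reverse=True)


# Hypothetical records; factor values are assumed to be pre-scaled to 0..1.
schools = [
    {"name": "School X",
     "factors": {"affordability": 0.9, "graduation_rate": 0.6, "program_strength": 0.5}},
    {"name": "School Y",
     "factors": {"affordability": 0.4, "graduation_rate": 0.9, "program_strength": 0.8}},
    {"name": "School Z",
     "factors": {"affordability": 0.6, "graduation_rate": 0.7, "program_strength": 0.9}},
]

# Two families with different priorities query the same data.
cost_focused = {"affordability": 0.7, "graduation_rate": 0.2, "program_strength": 0.1}
program_focused = {"affordability": 0.1, "graduation_rate": 0.2, "program_strength": 0.7}

cost_ranking = [s["name"] for s in rank_by_user_criteria(schools, cost_focused)]
program_ranking = [s["name"] for s in rank_by_user_criteria(schools, program_focused)]
print(cost_ranking)
print(program_ranking)
```

No single ordering is imposed: the cost-focused family and the program-focused family each get the ranking their own weights imply.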

The second approach -- value-rating -- does carry out the intent of the Obama/Duncan vision, ordinal rating of institutions, but based on the constituent properties of collegiate value delivery noted for the first approach.  What changes, what additional research is needed?  One model for the second approach might be structured as follows:  The starting point is a quota sample from America's colleges/universities serving as the development base, the sample reflecting meaningful categorizations of our institutions; for factors presumed causal for quality and equitable delivery by an institution, break out programs or tracks that constitute legitimate units of analysis; use a "human expert model" of decision making to create criterion positioning of the sample organizations, for the unit of analysis, by the various factors; then the goodness of fit is tested between metrics devised and expert positioning of all factors/units of analysis, mathematically determining the salience and weighting of factors that fit expert prediction.  Lastly, the metrics proving predictive are tested on a second comparable sample of our institutions for verification.
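The develop-then-verify logic of the value-rating model can be sketched in a few lines.  This is a simplified stand-in, not the full procedure: it screens each candidate metric by its Pearson correlation with expert positioning on a development sample, then rechecks the survivors on a second sample.  The metric names, expert ratings, and the 0.7 threshold are all assumptions made for the example.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


def metric_fit(sample, metric_names, expert_key="expert_rating"):
    """Correlation of each named metric with the expert positioning in a sample."""
    ratings = [unit[expert_key] for unit in sample]
    return {name: pearson([unit[name] for unit in sample], ratings)
            for name in metric_names}


# Hypothetical development sample: one row per unit of analysis.
development = [
    {"expert_rating": 1, "completion_rate": 0.50, "parking_spaces": 300},
    {"expert_rating": 2, "completion_rate": 0.58, "parking_spaces": 120},
    {"expert_rating": 3, "completion_rate": 0.66, "parking_spaces": 280},
    {"expert_rating": 4, "completion_rate": 0.71, "parking_spaces": 150},
    {"expert_rating": 5, "completion_rate": 0.83, "parking_spaces": 260},
]
# Hypothetical second, comparable sample used only for verification.
verification = [
    {"expert_rating": 2, "completion_rate": 0.55, "parking_spaces": 210},
    {"expert_rating": 1, "completion_rate": 0.52, "parking_spaces": 190},
    {"expert_rating": 4, "completion_rate": 0.74, "parking_spaces": 240},
    {"expert_rating": 3, "completion_rate": 0.69, "parking_spaces": 130},
    {"expert_rating": 5, "completion_rate": 0.80, "parking_spaces": 170},
]

THRESHOLD = 0.7  # assumed cutoff for calling a metric "predictive"
candidates = ["completion_rate", "parking_spaces"]

# Step 1: keep metrics that track expert positioning in the development sample.
salient = {name: r for name, r in metric_fit(development, candidates).items()
           if abs(r) >= THRESHOLD}
# Step 2: retest only those metrics on the second sample.
verified = {name: r for name, r in metric_fit(verification, list(salient)).items()
            if abs(r) >= THRESHOLD}
print(salient)
print(verified)
```

A real effort would fit weights for many factors jointly and stratify by institution type; the sketch only shows why a second sample matters, since a metric that fits by chance in development should fail on verification.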

The raw data to start building either of the above approaches are already out there, in the mass of college/university data banked on institutions' web sites, and made available in detail by a plethora of both public and private sector organizations.  Most of our institutions are working with their own game plans, but the composite of data generated could be a starting point, for example, for building a universal higher education database serving the value-choice approach.  A tragedy of our present society is that a Bill Gates, instead of funding programs designed to beat on our public schools with testing, apparently lacked the perspicacity to pursue even his own suite of digital experiences to fund and guide the assembly of a suitable higher education national database.

Can the value-rating model actually be executed?  It is arguable that it already has been in part, that the logic employed by Tom Peters and his associates in creating the corporate effort, In Search of Excellence, is an early precursor to that approach; it stopped short of seeking to quantify determinants of excellence, but the core idea was successful.  Using the power of that same Federal funding to our colleges/universities would serve as an incentive to engage our universities in needed research.  That is a far better use of the incentive than seeking to intimidate our institutions into change by ranking linked to punitive reductions in funding.  Lastly, you are developing metrics that are defined by the real measurement challenge, and not by what was developed for other purposes or is simply convenient.


Historically, toward the end of last century, one of the Presidential Commissions on Higher Education offered the White House and our higher education community very practical recommendations.  They encompassed reducing higher education costs, reforming funding of tuition and other costs of a degree, and cooperation among all of our post-secondary schools to adopt a common set of parameters making available to America's families uniform ways to assess collegiate choice.  Both our college/university leaderships, and our political system quickly rejected all three sets of well-reasoned recommendations.  Clearly, moving either of the above approaches, or anything resembling them to a productive destination would require some new mindsets, among our higher education institutions, and in Federal education leadership's sensitivity to genuine national needs over liberal dreaming.

Counterpoint is that some of our colleges and universities, presumably “reading the room,” have already initiated innovative changes in their collegiate instruction.  Reported in Saturday’s New York Times, changes are occurring in B-schools’ MBA programs -- to emulate the rapidity of change and experimentation from Silicon Valley -- and in basic collegiate science courses to move from lecture modes to high student involvement and problem solving.  Long valid patterns of diffusion of innovation will change higher education, even as the critically deficient Obama/Duncan rating scheme is stumbling out of the starting gate.  Perhaps merely the threat of that Federal ‘Franken data’ has stimulated collegiate action?  Incredibly cynical albeit clever if true; but if accurate the rest of the program should be given a quick burial.

On real inspection the proposed Department of Education rating scheme, regardless of intentions, simply reeks of ignorance and flawed understanding of complex academic organizational behavior, of advanced learning, and of the most basic principles of inquiry and social science explanation.  Their scheme could, analogically, be compared to trying to build a quantum computer using some AA batteries, a photo transistor, a couple of resistors/capacitors, and some wire scrounged from the ties used on garbage bags.  The present scheme, even if Machiavellian, as well as mirroring the mental set that any solution has to be punitive, is wholly unworthy of a Federal education function critical to our nation, and is condemnation of the current resources managing that agency.


The next issues of Edunationredux will move into challenges and opportunities throughout US higher education that might be areas for measured change along with possible innovations.  First out of the chute will be the footers for more productive higher education experiences -- bridging the chasm between our K-12, especially 9-12 school outputs, and the incoming requirements for collegiate success -- allowing passage through collegiate work with greater learning effect, in shorter periods of time, and therefore with less investment.
