The Hechinger Report spent the last year examining a major subset of school discipline: suspensions and expulsions for vague and subjective categories such as insubordination, disruption and disorderly conduct.
We started this project with some basic questions. How often do states suspend students for these reasons? What types of behavior do educators say constitute defiance or disorder? And are some students more likely to be punished for these offenses than others?
Answering these questions revealed how overwhelmingly common these types of suspensions are for a wide range of behaviors, including minor incidents. Here's how we did it:
How did we obtain state- and local-level suspension data?
Although we attempted to obtain data from all 50 states, there is no single place to obtain school discipline data broken down by suspension category. States do not report this information to the federal government. In fact, some states don't even collect it locally.
When available, data were downloaded from state Department of Education websites. When the data weren't readily available, we filed public records requests.
For New Mexico, we used data obtained and published by ProPublica.
What did we ultimately collect?
Ultimately, we collected data from Alabama, Alaska, California, Colorado, Georgia, Indiana, Louisiana, Maryland, Massachusetts, Minnesota, Mississippi, Montana, New Hampshire, New Mexico, North Carolina, Ohio, Oregon, Rhode Island, Vermont, and Washington.
In most cases, we received data covering 2017-18 through 2021-22. However, Vermont did not have data for 2021-22, and North Carolina only had data for 2019-20 and 2020-21.
We had demographic data available to examine racial and special education disparities in California, Indiana, Vermont, New Mexico, Montana, Maryland, Ohio, Rhode Island, Mississippi, and Massachusetts.
Was the data uniform?
Far from it. Each state has its own categories of student discipline, ranging from the six reasons a student can be suspended in California to the more than 80 reasons in Massachusetts.
First, we identified and isolated all categories related to defiance, disorder, or disruption. These were the main focus of our analysis. But we also wanted to know how often students were suspended for other reasons.
To do this, we looked for common threads among suspension categories and created larger categories of our own. For example, offense categories involving alcohol, drugs, or tobacco were grouped into an “Alcohol/Drugs/Tobacco” category. Any offense involving fighting or physical assault was placed under “physical violence.” These groupings were informed by a study of state disciplinary regulations and by discussion; we also showed them to experts for feedback. We ended up with 16 unique categories, and we added up the numbers for every state category that fell into one of the larger groups.
This gave us an overall look at how many punishments were assigned for a wide range of behaviors. However, direct comparisons between states are not advisable because each state defines these disciplinary categories differently.
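The rollup described above can be sketched in code. This is a minimal illustration, not the actual mapping we used: the group names, member categories, and counts below are all hypothetical examples.

```python
from collections import Counter

# Hypothetical mapping from state-specific suspension categories to a few of
# the broader groups described in the text. Real state category names vary
# widely; these members are illustrative only.
CATEGORY_GROUPS = {
    "Alcohol/Drugs/Tobacco": {"alcohol possession", "drug possession", "tobacco use"},
    "Physical violence": {"fighting", "assault on student", "assault on staff"},
    "Defiance/Disorder": {"insubordination", "disruption", "disorderly conduct"},
}

def group_for(state_category: str) -> str:
    """Return the broad group for a state-specific category, or 'Other'."""
    for group, members in CATEGORY_GROUPS.items():
        if state_category.lower() in members:
            return group
    return "Other"

# Summing state-reported counts within each broad group (made-up numbers):
rows = [("insubordination", 120), ("fighting", 45), ("tobacco use", 12)]
totals = Counter()
for category, count in rows:
    totals[group_for(category)] += count
```

Because each state names and defines its categories differently, a real version of this mapping needs a separate lookup table per state, which is why cross-state comparisons of the resulting totals are not advisable.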
How did we handle missing or suppressed data?
In all states, suspension counts below a certain threshold (usually fewer than 10, but sometimes fewer than 5) were suppressed so that individual students could not be identified. Since there was no way to accurately estimate those numbers, we treated them as zero. In most states, this did not affect the overall findings. In smaller states or districts where suppression was significant, we only looked at the grand totals.
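The treat-suppressed-as-zero rule can be sketched as a small parsing helper. The suppression markers shown here (`<10`, `*`) are assumptions for illustration; states flag suppressed cells in different ways, and this is not a complete list of what any state actually uses.

```python
def parse_count(value: str) -> int:
    """Parse a reported suspension count, treating suppressed cells as zero.

    Since the true count behind a suppressed cell is unknown, we
    conservatively assume zero, which can only undercount suspensions.
    """
    try:
        return int(value)
    except ValueError:
        return 0  # suppressed marker such as '<10' or '*' (illustrative)

# Example: a column of reported counts with two suppressed cells.
counts = [parse_count(v) for v in ["42", "<10", "*", "15"]]
```

A consequence of this choice, noted in the text, is that totals in small states or districts with heavy suppression understate the real numbers, which is why only grand totals were used there.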
Were there any other limitations to the data?
Yes. Once again, we had to contend with the fact that states are not uniform in how they collect this information. In some places, we received information only about suspensions. In others, the data also included expulsions. In Alabama, cases of corporal punishment and placement in alternative schools were included as well.
Some states allow school districts to report only one reason for each suspension. Others allow districts to report multiple reasons. And to make things even more confusing, some states reported the number of students suspended, while others reported the number of incidents that resulted in suspensions. We've created a list with details for individual states.
How did we analyze demographic disparities?
We calculated suspension rates by looking at how many students of a particular race were suspended per 100 students of that race in a state or district. To compare the suspension rates of Black and white students, we divided the former by the latter. For example, if Black students were suspended at a rate of 4 per 100 Black students in a state and white students at a rate of 2 per 100 white students, then Black students were suspended at twice the rate of white students (4/2 = 2).
We conducted the same analysis for students with disabilities compared to their general education peers.
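The rate and ratio calculation above is simple enough to express directly. The enrollment and suspension numbers below are made-up values chosen to reproduce the 4-per-100 and 2-per-100 example from the text; they are not real data.

```python
def suspension_rate(suspended: int, enrolled: int) -> float:
    """Suspensions per 100 students of a given group."""
    return suspended / enrolled * 100

def disparity_ratio(group_rate: float, reference_rate: float) -> float:
    """How many times higher the group's rate is than the reference rate."""
    return group_rate / reference_rate

# Illustrative numbers matching the example in the text:
black_rate = suspension_rate(400, 10_000)        # 4 per 100 Black students
white_rate = suspension_rate(800, 40_000)        # 2 per 100 white students
ratio = disparity_ratio(black_rate, white_rate)  # Black students suspended at 2x the rate
```

The same two functions apply unchanged to the disability comparison: rate for students with disabilities divided by the rate for their general education peers.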
How did we find out what behaviors students were suspended for?
We submitted open records requests to dozens of school districts across the country, requesting disciplinary records from the most recent year or two for suspensions assigned to insubordination or disorderly conduct categories.
Most districts either denied or did not respond to our requests. Some estimated that it would cost tens of thousands of dollars to retrieve the records. In total, 12 districts in eight states fulfilled our request for free or at a reasonable cost, providing more than 7,000 disciplinary records for analysis.
So how did we analyze them?
After reading many records to identify patterns, we once again created several broad categories for behaviors that appeared over and over again, including talking back to an educator, swearing, and refusing direct commands.
About 1,700 records were in PDF format (including handwritten notes) and couldn't be easily converted to a spreadsheet. We hand-coded all of these ourselves, marking each incident with a yes or no for each of our categories. We also hand-coded 1,500 of the remaining records. Each incident could fall into multiple categories. We checked each other's work to ensure consistency.
We then used a machine learning library to train a model on the labeled dataset, and used the trained model to predict the same categories for the remaining incident reports. The model's accuracy (measured on a test set held out from the labeled data) varied across categories, but overall the model had a low false positive rate. We also spot-checked our findings to ensure that records had not been misclassified.
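The article names only "a machine learning library," so the sketch below assumes scikit-learn with a bag-of-words text model; the actual pipeline, features, and classifier may have been different. The incident texts and labels are invented examples, far too few to train a useful model — they only illustrate the shape of the workflow (one binary classifier per hand-coded yes/no category).

```python
# Assumed stack: scikit-learn (not confirmed by the article).
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Tiny illustrative training set: incident narrative, labeled 1 if it
# involves "talking back" (one of the hand-coded categories), else 0.
texts = [
    "student talked back to the teacher and refused to sit down",
    "student cursed at a classmate during lunch",
    "refused a direct command and argued with the educator",
    "student was late to class three times this week",
]
labels = [1, 0, 1, 0]

# Vectorize the text and fit a binary classifier for this one category;
# the real workflow would repeat this (or use a multilabel setup) for
# each category, and evaluate on a held-out test split.
model = make_pipeline(TfidfVectorizer(), LogisticRegression())
model.fit(texts, labels)

# Predict the category for an unlabeled incident report.
predictions = model.predict(["argued and talked back to the teacher"])
```

In a real run, the held-out test split is what supports the accuracy and false-positive-rate claims in the text, and spot-checking predicted labels against the source records catches systematic misclassification the test metrics can miss.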
This story about school discipline data was produced by The Hechinger Report, a nonprofit, independent news organization focused on inequality and innovation in education. Sign up for the Hechinger newsletter.