
Pilot Findings: Misconceptions in Portland-Metro Classrooms

After 8 weeks with three pilot schools in the Portland metro area, our data surfaced patterns that surprised even experienced teachers. Here is what we found about 7th and 8th grade algebra readiness.

Oct 14, 2025 Jessica Tanaka 10 min read


These are preliminary findings from Brainpathio's first pilot cohort. The data comes from eight weeks of use across three schools in the Portland metropolitan area, involving seven STEM teachers and approximately 340 students in grades 7 and 8. We are sharing it with the same transparency we committed to at pilot launch: the honest version, including the parts that did not go as expected.

A few important caveats before the findings: this is a small sample, drawn from a geographically concentrated area, during a single unit in a single school year. We are not claiming these patterns generalize to all 7th and 8th grade classrooms. We are claiming they are real patterns from real classrooms, and that some of them are interesting enough to warrant attention from teachers and curriculum coordinators who work in similar contexts.

What the Data Covered

The pilot focused on two content areas: 7th grade proportional reasoning (aligned to CCSS 7.RP) and 8th grade linear equations and expressions (aligned to CCSS 8.EE). Teachers used Brainpathio as the primary problem set delivery mechanism for one unit in each class. The system tracked response patterns across approximately 2,400 problem-set sessions, generating misconception alerts when student response patterns crossed detection thresholds.

At the end of the pilot, we provided each participating teacher with a misconception summary report — a breakdown of which misconception types were detected at what frequency across their classes. We also conducted structured debrief interviews with each teacher to understand which alerts they had acted on, which they had not, and why.

Finding 1: The Prevalence of Variable-as-Label Confusion Entering 8th Grade

The finding that drew the most surprise in teacher debrief interviews was how often 8th grade students entering the linear equations unit showed variable-as-label confusion: treating a variable as an abbreviated label for a specific object rather than as an unknown that can take any value. A student with this misconception might read 3a as "3 apples" instead of "3 times some number a." Our detection algorithm flagged the pattern in approximately 28% of students in the 8th grade cohort.

This is higher than we expected, and higher than informal teacher estimates going into the pilot. When we shared this figure with the three 8th grade math teachers in the cohort, all three described it as "more than I would have guessed" — two put their informal estimate at closer to 10-15%. One teacher, who had been teaching 8th grade math for eleven years, said she recognized the misconception when she saw it flagged but had not been tracking it explicitly: "I knew some kids struggled with variables, but I thought that was mostly a few students, not a quarter of the class."

The implications for pacing decisions are significant. Variable-as-label confusion entering 8th grade means that a non-trivial fraction of students are attempting to learn equation solving, a procedural skill that only makes sense once you understand what solving for a variable means, without the foundational conceptual model in place. For these students, standard equation-solving practice runs into a conceptual gap that more practice alone will not close.

Finding 2: The Proportional Reasoning Split in 7th Grade

In the 7th grade proportional reasoning unit, the most common error pattern was the "constant difference" error — extending a proportional relationship additively rather than multiplicatively. When presented with a table showing that a car travels 60 miles in 1 hour and 120 miles in 2 hours, and asked how far it travels in 5 hours, students with this error type correctly identify the pattern in the first two entries (adding 60 each time) and then apply the same additive rule to produce 180 miles for 3 hours, 240 for 4, and 300 for 5 hours.

In this case, the answer is actually correct: the table advances one hour at a time, so adding 60 miles per step gives the same result as multiplying by the rate of 60 miles per hour. But the detection algorithm flagged these students because their response patterns on more complex problems (non-unit rates, fractional relationships) showed they were applying the additive rule in situations where it failed. The students were getting the right answers for the wrong reasons, which is one of the harder detection problems in misconception diagnosis.
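To make the "right answer, wrong reason" problem concrete, here is a minimal Python sketch of the two strategies. It is an illustration only, not Brainpathio's detection logic. The function names and the second table (5 units for every 2, sampled at x = 2 and x = 4) are hypothetical, and the additive model is just one plausible way a student might extend a table.

```python
from fractions import Fraction


def additive_strategy(table, target_x):
    """Model of the "constant difference" error (hypothetical).

    The student notices the difference between the first two y-values
    and adds it once for every +1 in x, ignoring how far apart the
    x-values in the table actually are.
    """
    (_, y0), (x1, y1) = table[0], table[1]
    step = y1 - y0
    return y1 + step * (target_x - x1)


def multiplicative_strategy(table, target_x):
    """Proportional reasoning: scale the target by the unit rate y/x."""
    x0, y0 = table[0]
    return Fraction(y0, x0) * target_x


# Unit-step table from the car example: both strategies agree.
car = [(1, 60), (2, 120)]
car_additive = additive_strategy(car, 5)
car_multiplicative = multiplicative_strategy(car, 5)

# Non-unit-step table: the same additive rule now gives a wrong answer.
non_unit = [(2, 5), (4, 10)]
non_unit_additive = additive_strategy(non_unit, 7)
non_unit_multiplicative = multiplicative_strategy(non_unit, 7)

print(car_additive, car_multiplicative)
print(non_unit_additive, non_unit_multiplicative)
```

On the car table both functions return 300, so a single correct answer cannot tell the strategies apart. Only on the second table, where the rows do not advance one unit at a time, do the two results diverge.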

The second notable pattern in the 7th grade data was more expected but still striking in magnitude: approximately 34% of students in the proportional reasoning unit showed at least one instance of part-whole ratio confusion — treating a part-to-part ratio as a part-to-whole fraction in computation tasks. This is a well-documented misconception in the research literature, and its frequency in our data aligns with ranges reported in academic studies of ratio and proportion understanding. Teachers were less surprised by this one than by the variable-as-label findings.
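A short worked example may help show what part-whole ratio confusion looks like in a computation task. The scenario (red and blue marbles in a 2:3 ratio, 20 marbles in total) and the function names are hypothetical and were not pilot items.

```python
from fractions import Fraction


def share_of_whole(part, other_part):
    """Correct conversion: a part-to-part ratio a:b means a out of (a + b)."""
    return Fraction(part, part + other_part)


def confused_share(part, other_part):
    """The misconception: reading the ratio a:b directly as the fraction a/b."""
    return Fraction(part, other_part)


total_marbles = 20
red, blue = 2, 3  # red:blue = 2:3 (part-to-part)

correct_red = share_of_whole(red, blue) * total_marbles
confused_red = confused_share(red, blue) * total_marbles

print(correct_red)   # 2/5 of 20
print(confused_red)  # 2/3 of 20, not a whole number of marbles
```

The correct conversion gives 8 red marbles. The confused reading gives 40/3, a non-integer count, which is the kind of implausible result that can signal this error in a student's work.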

Finding 3: Teachers Acted on Roughly Half the Alerts They Received

In debrief interviews, teachers reported acting on roughly half the alerts they received — meaning they changed something about their instruction, provided targeted feedback to a specific student, or followed up with additional diagnostic conversation. The other half of alerts were seen and not acted on.

This is important data about what the system actually accomplished in classroom conditions, not just what it detected. We asked teachers why they had not acted on certain alerts. The answers clustered into three categories: not enough time in the current unit to change course ("we were already past that topic"); uncertainty about whether the alert was accurate ("I wasn't sure if this was a real pattern or a fluke answer"); and lack of a clear instructional response ("I didn't know what to do with this information").

The first category is partly a structural constraint that we cannot solve from the software side — pacing pressure is real. The second category is a calibration problem we can address by improving the confidence indicators and adding more explanation to alert descriptions. The third category is the most actionable and informed our decision to prioritize building intervention suggestions into the next development cycle.

Finding 4: Detection Confidence Improved Over Time — But Slowly

The misconception detection engine is designed to accumulate confidence as it gathers more response data per student. Early in the pilot, detection alerts had lower confidence ratings, reflecting limited data. By the end of week four, confidence ratings had increased substantially for students who used the system consistently.

The complication is that student usage was uneven. In two of the three schools, a subset of students (between 15% and 25%) had session completion rates below 50%, meaning the engine never accumulated enough data to reach high-confidence detection thresholds for them. These were often the students most likely to have underlying misconceptions, since lower engagement and higher error rates tend to co-occur. This detection gap among the least-engaged students is a real limitation of the current design.
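One way to see why low completion caps detection confidence is a sketch in which confidence is the Wilson score lower bound on the rate at which a student's responses match a misconception pattern. This is an assumption made for illustration, not a description of how Brainpathio's engine computes confidence. The function names and the 0.35 alert threshold are hypothetical.

```python
import math


def wilson_lower_bound(matches, n, z=1.96):
    """Lower end of the Wilson score interval for a proportion.

    matches: responses consistent with the misconception pattern
    n:       diagnostic responses observed so far
    z:       normal quantile (1.96 is roughly a 95% interval)
    """
    if n == 0:
        return 0.0
    p = matches / n
    denom = 1 + z * z / n
    centre = p + z * z / (2 * n)
    margin = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n))
    return (centre - margin) / denom


ALERT_THRESHOLD = 0.35  # hypothetical cut-off for a high-confidence alert


def should_alert(matches, n):
    return wilson_lower_bound(matches, n) >= ALERT_THRESHOLD


# Same underlying match rate (60%), different amounts of data.
low_completion = should_alert(3, 5)      # few responses observed
high_completion = should_alert(12, 20)   # four times as many responses

print(low_completion, high_completion)
```

Under these assumptions, two students with the same 60% match rate are treated differently: the one with only five observed responses stays below the threshold, while the one with twenty crosses it. That mirrors the gap described above, where students with low session completion never reach high-confidence detection.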

What We Are Taking Forward

The pilot data has directly shaped three product decisions for the next development cycle.

First, we are adding intervention suggestion content for the five most frequently detected misconception types: variable-as-label confusion, constant-difference proportional reasoning errors, part-whole ratio confusion, whole-number interference in fractions, and inverse operation confusion in equation solving. Each will have a mapped set of suggested 5-minute reteach activities designed for the classroom conditions our pilot teachers described.

Second, we are redesigning the alert presentation to include a plain-language explanation of what the detected pattern likely means for instruction, not just the misconception label. "Variable-as-label confusion" means something to a teacher who has encountered this in research; it means less to a teacher encountering the label for the first time under time pressure.

Third, we are building a low-completion fallback — a shorter diagnostic probe sequence designed to generate useful signal even from students who complete only 30-40% of a session. Perfect coverage from perfect completion rates is not realistic in most classrooms. The system needs to be useful in real conditions.

We are grateful to the teachers and students who participated in this cohort. What we learned from eight weeks in three Portland-area schools has changed the product substantially, and we believe it has made the research questions we are asking sharper. The next cohort opens in January 2026 — details at our pilot program page.