Science Benchmarking Report, TIMSS 1999 – Eighth Grade
Contents

  • What Is TIMSS 1999 Benchmarking?
  • Why Did Countries, States, Districts, and Consortia Participate?
  • Which Countries, States, Districts, and Consortia Participated?
  • What Is the Relationship Between the TIMSS 1999 Data for the United States and the Data for the Benchmarking Study?
  • How Was the TIMSS 1999 Benchmarking Study Conducted?
  • What Was the Nature of the Science Test?
  • How Does TIMSS 1999 Compare with NAEP?
  • How Do Country Characteristics Differ?
  • How Do the Benchmarking Jurisdictions Compare on Demographic Indicators?
  • How Is the Report Organized?

© 2001 International Association for the Evaluation of Educational Achievement (IEA)
Introduction

Over the last decade, many states and school districts have created content and performance standards targeted at improving students’ achievement in mathematics and science. In science, most states are in the process of implementing new standards or revising existing ones.(1) All but four states now have content or curriculum standards in science.(2) Much of this effort has been based on work done at the national level during this period to develop standards aimed at increasing the science literacy of all students. The two most prominent documents are the American Association for the Advancement of Science (AAAS) Benchmarks for Science Literacy and the National Research Council’s National Science Education Standards (NSES), both of which define standards for the teaching and learning of science that many state and local educational systems have used to fashion their own curricula.(3)

Particularly during the past decade, states and school districts have expended enormous energy not only on developing science content standards but also on improving teacher quality and school environments and on developing assessments and accountability measures.(4) Participating in an international assessment provides states and school districts with a global context for evaluating the success of their policies and practices aimed at raising students’ academic achievement.

What Is TIMSS 1999 Benchmarking?

TIMSS 1999, a successor to the 1995 Third International Mathematics and Science Study (TIMSS), focused on the mathematics and science achievement of eighth-grade students. Thirty-eight countries, including the United States, participated in TIMSS 1999 (also known as TIMSS-Repeat or TIMSS-R). Even more significantly for the United States, TIMSS 1999 included a voluntary Benchmarking Study. Participation in the TIMSS 1999 Benchmarking Study at the eighth grade gave states, districts, and consortia an unprecedented opportunity to assess the comparative international standing of their students’ achievement and to evaluate their mathematics and science programs in an international context. Participants could also compare their achievement with that of the United States as a whole and, where both a district and its state participated, districts could compare their performance with that of their state.

Originally conducted in 1994-1995,(5) TIMSS compared the mathematics and science achievement of students in 41 countries at five grade levels. Using questionnaires, videotapes, and analyses of curriculum materials, TIMSS also investigated the contexts for learning mathematics and science in the participating countries. TIMSS results, which were first reported in 1996, have stirred debate, spurred reform efforts, and provided important information to educators and decision makers around the world. The findings from TIMSS 1999, a follow-up to the earlier study, add to the richness of the TIMSS data and their potential to have an impact on policy and practice in mathematics and science teaching and learning.

Twenty-seven jurisdictions from across the nation – 13 states and 14 districts or consortia – participated in the Benchmarking Study (see Exhibit 1). To conduct the study, the TIMSS 1999 assessments were administered to representative samples of eighth-grade students in each of the participating districts and states in the spring of 1999, at the same time and following the same guidelines as those established for the 38 countries.

In addition to testing achievement in mathematics and science, the TIMSS 1999 Benchmarking Study involved administering a broad array of questionnaires. TIMSS collected extensive information from students, teachers, and school principals, as well as system-level information from each participating entity, about mathematics and science curricula, instruction, home contexts, and school characteristics and policies. The TIMSS data provide an abundance of information, making it possible to analyze differences in current levels of performance in relation to a wide variety of factors associated with the classroom, school, and national contexts within which education takes place.

Why Did Countries, States, Districts, and Consortia Participate?

The decision to participate in any cycle of TIMSS is made by each country according to its own data needs and resources. Similarly, the states, districts, and consortia that participated in the Benchmarking Study decided to do so for various reasons.

Primarily, the Benchmarking participants are interested in building educational capacity and looking at their own situations in an international context as a way of improving mathematics and science teaching and learning in their jurisdictions. International assessments provide an excellent basis for gaining multiple perspectives on educational issues and examining a variety of possible reasons for observed differences in achievement. While TIMSS helps to measure progress towards learning goals in mathematics and science, it is much more than an educational Olympics. It is a tool to help examine such questions as:

  • How demanding are our curricula and expectations for student learning?
  • Is our classroom instruction effective? Is the time provided for instruction being used efficiently?
  • Are our teachers well prepared to teach science concepts? Can they help students understand science?
  • Do our schools provide an environment that is safe and conducive to learning?

Unlike many countries around the world, where educational decision making is highly centralized, in the United States opportunities to learn science derive from an educational system that operates through states and districts, which in turn allocate opportunities through schools and then through classrooms. Improving students’ opportunities to learn requires examining every step of the educational system, including the curriculum, teacher quality, the availability and appropriateness of resources, student motivation, instructional effectiveness, parental support, and school safety.

Which Countries, States, Districts, and Consortia Participated?

Exhibit 1 shows the 38 countries, 13 states, and the 14 districts and consortia that participated in TIMSS 1999 and the Benchmarking Study.

The consortia consist of groups of entire school districts or individual schools from several districts that organized together either to participate in the Benchmarking Study or to collaborate across a range of educational issues. Descriptions of the consortia that participated in the project follow.

Delaware Science Coalition. The Delaware Science Coalition (DSC) is a coalition of 15 school districts working in partnership with the Delaware Department of Education and the business-based Delaware Foundation for Science and Mathematics Education. The mission of the DSC is to improve the teaching and learning of science for all students in grades K-8. The Coalition includes more than 2,200 teachers who serve more than 90 percent of Delaware’s public school students.

First in the World Consortium. The First in the World Consortium consists of 18 districts from the North Shore of Chicago that have joined forces to bring a world-class education to the region’s students and to improve mathematics and science achievement in their schools. Growing out of meetings of district superintendents in 1995, the consortium decided to focus on three main goals: benchmarking its performance to educational standards by participating in the original TIMSS in 1996 and again in 1999; creating a forum for sharing the vision of benchmarking to world-class standards with businesses and the community; and establishing a network of learning communities of teachers, researchers, parents, and community members to carry out the work needed to achieve these goals.

Fremont/Lincoln/Westside Public Schools. The Fremont/Lincoln/Westside consortium comprises three public school districts in Nebraska that joined together specifically to participate in the TIMSS 1999 Benchmarking Study.

Michigan Invitational Group. The Michigan Invitational Group is a heterogeneous, socioeconomically diverse group of urban, suburban, and rural schools across Michigan. Schools invited to participate in this consortium were those that were using National Science Foundation (NSF) materials and well-developed curricula and that provided staff development to teachers.

Project SMART Consortium. Project SMART (Science & Mathematics Achievement Required For Tomorrow) is a consortium of 30 diverse school districts in northeast Ohio committed to continuous improvement, long-term systemic change, and improved student learning in science and mathematics in grades K-12. It is jointly funded by the Ohio Department of Education and the Martha Holden Jennings Foundation. The schools that participated in the Benchmarking Study represent 17 of the 30 districts.

Southwest Pennsylvania Math and Science Collaborative. The Southwest Pennsylvania Math and Science Collaborative, established in 1994, coordinates efforts and focuses resources on strengthening math and science education throughout the southwestern Pennsylvania workforce region centered on Pittsburgh. Committed to gathering and using good information that can help prepare its students to be productive citizens, the Collaborative comprises all 118 “local control” public districts, as well as the parochial and private schools, in the nine-county region. Several of these districts are working together to select exemplary materials, develop curriculum frameworks, and build sustained professional development strategies that strengthen math and science instruction.

What Is the Relationship Between the TIMSS 1999 Data for the United States and the Data for the Benchmarking Study?

The results for the 38 countries participating in TIMSS 1999, including those for the United States, were reported in December 2000 in two companion reports – the TIMSS 1999 International Science Report and the TIMSS 1999 International Mathematics Report.(6) Performance in the United States relative to that of other nations was reported by the U.S. National Center for Education Statistics in Pursuing Excellence.(7) The results for the United States in those reports, as well as in this volume and its companion mathematics report,(8) were based on a nationally representative sample of eighth-grade students drawn in accordance with TIMSS guidelines for all participating countries.

Because having valid and efficient samples in each country is crucial to the quality and integrity of TIMSS, procedures and guidelines have been developed to ensure that the national samples are of the highest possible quality. Following the TIMSS guidelines, representative samples were also drawn for the Benchmarking entities. Sampling statisticians at Westat, the organization responsible for sampling and data collection for the United States, worked in accordance with TIMSS standards to design procedures that would coordinate the assessment of separate representative samples of students within each Benchmarking entity. For the most part, the U.S. TIMSS 1999 national sample was separate from the students assessed in the Benchmarking jurisdictions. Each Benchmarking participant had its own sample, providing comparisons with each of the TIMSS 1999 countries, including the United States. In general, the Benchmarking samples were drawn in accordance with the TIMSS standards, and achievement results can be compared with confidence; deviations from the guidelines are noted in the exhibits in the reports. The TIMSS 1999 sampling requirements and the outcomes of the sampling procedures for the participating countries and Benchmarking jurisdictions are described in Appendix A. Although taken collectively the Benchmarking participants are not representative of the United States, the effort was substantial in scope, involving approximately 1,000 schools, 4,000 teachers, and 50,000 students.
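For readers who want a concrete picture of what drawing “representative samples” involves, large-scale assessments of this kind typically use a two-stage design in which schools are first selected with probability proportional to size (PPS) and intact classrooms are then sampled within schools. The sketch below illustrates only the school-selection stage, using systematic PPS sampling over hypothetical enrollment data; it is a minimal sketch of the general technique, not a reproduction of Westat’s actual procedures.

```python
import random

def pps_systematic_sample(schools, n_sample):
    """Select n_sample schools with probability proportional to size (PPS)
    via systematic sampling with a random start. `schools` is a list of
    (name, enrollment) pairs; sorting it beforehand by region or other
    stratifiers yields implicit stratification."""
    total = sum(size for _, size in schools)
    interval = total / n_sample             # step size on the cumulative-size scale
    start = random.uniform(0, interval)     # random start within the first interval
    ticks = [start + k * interval for k in range(n_sample)]

    selected, cumulative = [], 0
    it = iter(schools)
    name, size = next(it)
    for tick in ticks:
        # Advance along the cumulative enrollment scale until the tick
        # lands inside the current school's segment.
        while cumulative + size < tick:
            cumulative += size
            name, size = next(it)
        selected.append(name)               # a very large school may be hit twice
    return selected

# Hypothetical data: (school name, eighth-grade enrollment)
schools = [("School %02d" % i, random.randint(50, 400)) for i in range(40)]
print(pps_systematic_sample(schools, n_sample=8))
```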

How Was the TIMSS 1999 Benchmarking Study Conducted?

The TIMSS 1999 Benchmarking Study was a shared venture. In conjunction with the Office of Educational Research and Improvement (OERI) and the National Science Foundation (NSF), the National Center for Education Statistics (NCES) worked with the International Study Center at Boston College to develop the study. Each participating jurisdiction invested valuable resources in the effort, primarily for data collection, including the costs of administering the assessments at the same time and with the same procedures as TIMSS in the United States. Many participants have also devoted considerable resources to team building and staff development to facilitate use of the TIMSS 1999 results as an effective tool for school improvement.

The TIMSS studies are conducted under the auspices of the International Association for the Evaluation of Educational Achievement (IEA), an independent cooperative of national and governmental research agencies with a permanent secretariat based in Amsterdam, the Netherlands. Its primary purpose is to conduct large-scale comparative studies of educational achievement to gain a deeper understanding of the effects of policies and practices within and across systems of education.

TIMSS is part of a regular cycle of international assessments of mathematics and science planned to chart trends in achievement over time, much like the regular cycle of national assessments conducted in the U.S. through the National Assessment of Educational Progress (NAEP). Work has begun on TIMSS 2003, and a regular cycle of studies is planned for the years beyond.

The IEA delegated responsibility for the overall direction and management of TIMSS 1999 to the International Study Center in the Lynch School of Education at Boston College, headed by Michael O. Martin and Ina V.S. Mullis. In carrying out the project, the International Study Center worked closely with the IEA Secretariat, Statistics Canada in Ottawa, the IEA Data Processing Center in Hamburg, Germany, and Educational Testing Service in Princeton, New Jersey. Westat in Rockville, Maryland, was responsible for sampling and data collection for the Benchmarking Study as well as the U.S. component of TIMSS 1999 so that procedures would be coordinated and comparable.

Funding for TIMSS 1999 was provided by the United States, the World Bank, and the participating countries. Within the United States, funding agencies included NCES, NSF, and OERI, the same group of organizations supporting major components of the TIMSS 1999 Benchmarking Study for states, districts, and consortia, including overall coordination as well as data analysis, reporting, and dissemination.

What Was the Nature of the Science Test?

The TIMSS curriculum frameworks developed for 1995 were also used for 1999. They describe the content dimensions for the TIMSS tests as well as the performance expectations (behaviors that might be expected of students in school science).(9) Six content areas were covered in the TIMSS 1999 science test. These areas and the percentage of the test items devoted to each are earth science (15 percent), life science (27 percent), physics (27 percent), chemistry (14 percent), environmental and resource issues (9 percent), and scientific inquiry and the nature of science (8 percent). The performance expectations include understanding simple information (39 percent), understanding complex information (31 percent), theorizing, analyzing, and solving problems (19 percent), using tools, routine procedures, and science processes (7 percent), and investigating the natural world (4 percent).
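Because these percentages refer to shares of a fixed item pool (the 146-item test described below), they translate directly into approximate item counts per content area. The short calculation below is illustrative arithmetic only; exact per-area counts may differ slightly because the published percentages are rounded.

```python
# Approximate item counts implied by the reported category percentages of
# the 146-item TIMSS 1999 science test. Illustrative arithmetic only: the
# published percentages are rounded, so exact per-area counts may differ.
TOTAL_ITEMS = 146

content_areas = {
    "earth science": 15,
    "life science": 27,
    "physics": 27,
    "chemistry": 14,
    "environmental and resource issues": 9,
    "scientific inquiry and the nature of science": 8,
}

assert sum(content_areas.values()) == 100  # shares of the whole test
for area, pct in content_areas.items():
    approx = round(TOTAL_ITEMS * pct / 100)
    print(f"{area:<45} {pct:>3}%  ~{approx} items")
```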

The test items were developed through a cooperative and iterative process involving the National Research Coordinators (NRCs) of the participating countries. All of the items were reviewed thoroughly by subject-matter experts and field tested. Nearly all the TIMSS 1999 countries participated in field testing with nationally representative samples, and the NRCs had several opportunities to review the items and scoring criteria. The TIMSS 1999 science test contained 146 items representing a range of science topics and skills.

About one-fourth of the questions were in the free-response format, requiring students to generate and write their answers. These questions, some of which required extended responses, were allotted about one-third of the testing time. Responses to the free-response questions were evaluated to capture diagnostic information, and some were scored using procedures that permitted partial credit. Chapter 2 of this report contains 20 example items illustrating the range of science concepts and processes covered in the TIMSS 1999 test. Appendix D contains descriptions of the topics and skills assessed by each item.
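As a rough illustration of how diagnostic scoring with partial credit can work, free-response rubrics in TIMSS-style assessments use two-digit response codes in which the first digit carries the credit awarded and the second digit records a diagnostic category of the response. The specific code ranges below are assumptions for illustration, not the published TIMSS rubrics.

```python
def score_from_code(code: int) -> int:
    """Map a two-digit response code to a point score.

    Hypothetical illustration of a two-digit coding convention: the first
    (tens) digit carries the credit awarded, the second digit records a
    diagnostic category (e.g., which common misconception appeared).
    These particular code ranges are assumptions, not published rubrics.
    """
    tens = code // 10
    if tens == 2:   # 20-29: fully correct response (2 points)
        return 2
    if tens == 1:   # 10-19: partially correct response (1 point)
        return 1
    return 0        # 70-79 incorrect, 90-99 blank/uninterpretable (0 points)

# Three responses to the same item, each assigned a code by a scorer:
for code in (21, 13, 76):
    print(code, "->", score_from_code(code), "point(s)")
```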

Testing was designed so that no one student took all the items, which would have required more than three hours of testing time. Instead, the test was assembled in eight booklets, each requiring 90 minutes to complete. Each student took only one booklet, and the items were rotated through the booklets so that each item was answered by a representative sample of students.
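This rotation scheme is a form of matrix sampling: the item pool is spread across booklets so that no student sees every item, yet each item is administered to a random subsample. The sketch below is deliberately simplified, rotating individual items round-robin; the actual TIMSS design rotated blocks (clusters) of items through the booklets, with some clusters shared across booklets to link them.

```python
# Simplified sketch of a rotated-booklet (matrix-sampling) design:
# 146 items spread round-robin over 8 booklets so that no student takes
# every item, yet each item reaches roughly one-eighth of the sample.
# The real TIMSS design rotated clusters of items, with some clusters
# appearing in multiple booklets to link them; this is illustrative only.
N_ITEMS, N_BOOKLETS = 146, 8

booklets = {b: [] for b in range(1, N_BOOKLETS + 1)}
for item in range(1, N_ITEMS + 1):
    booklets[(item - 1) % N_BOOKLETS + 1].append(item)

for b, items in booklets.items():
    print(f"Booklet {b}: {len(items)} items, e.g. items {items[:4]}")
# Each student is randomly assigned one booklet, so every item is answered
# by a representative subsample of students.
```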

How Does TIMSS 1999 Compare with NAEP?

The National Assessment of Educational Progress (NAEP) is an ongoing program that has reported the science achievement of U.S. students for some 30 years. TIMSS and NAEP were designed to serve different purposes, and this is evident in the types of assessment items as well as the content areas and topics covered in each assessment. TIMSS and NAEP both assess students at the eighth grade, and both tend to focus on science as it is generally presented in classrooms and textbooks. However, TIMSS is based on the curricula that students in the participating countries are likely to have encountered by the eighth grade, while NAEP is based on an expert consensus of what students in the United States should know and be able to do in science and other academic subjects at that grade. For example, TIMSS 1999 appears to place more emphasis on the physical sciences (physics and chemistry) than does NAEP, while NAEP appears to distribute its focus more equally among physical science, earth science, and life science.(10)

Whereas NAEP is designed to provide comparisons among states and between states and the nation as a whole, the major purpose of the TIMSS 1999 Benchmarking Study was to provide entities in the United States with a way to compare their achievement and instructional programs in an international context. Thus, the point of comparison, or “benchmark,” consists primarily of the high-performing TIMSS 1999 countries. The sample sizes were designed to place participants near the top, middle, or bottom of the international TIMSS performance continuum, but not necessarily to detect differences in performance among the Benchmarking participants themselves. For example, all 13 participating states performed similarly in science relative to the TIMSS countries – in the upper half of the international distribution of results. As findings from the 2000 NAEP assessment are released, it is important to understand the differences and similarities between the assessments in order to make sense of the findings in relation to each other.

How Do Country Characteristics Differ?

International studies of student achievement provide valuable comparative information about student performance, instructional practice, and curriculum. Accompanying the benefits of international studies, though, are challenges associated with making comparisons across countries, cultures, and languages. TIMSS attends to these issues through careful planning and documentation, cooperation among the participating countries, standardized procedures, and rigorous attention to quality control throughout.(11)

It is extremely important, nevertheless, to consider the TIMSS 1999 results in light of countrywide demographic and economic factors. Some selected demographic characteristics of the TIMSS 1999 countries are presented in Exhibit 2. Countries ranged widely in population, from almost 270 million in the United States to less than one million in Cyprus, and in size, from almost 17 million square kilometers in the Russian Federation to less than one thousand in Hong Kong SAR and Singapore. Countries also varied widely on indicators of health, such as life expectancy at birth and infant mortality rate, and of literacy, including adult literacy rate and daily newspaper circulation. Exhibit 3 shows information for selected economic indicators, such as gross national product (GNP) per capita, expenditure on education and research, and development aid. The data reveal that there is great disparity in the economic resources available to participating countries.

One fundamental way in which countries differ is how science instruction is organized at the eighth grade. In some countries eighth-grade science is taught as a single general or integrated subject, while in others it is taught as separate subjects, namely earth science, biology, physics, and chemistry. The majority of countries, including the United States, teach eighth-grade science as a single integrated subject, although in many countries, particularly European ones, it is common practice to teach the sciences separately. Exhibit 5.1 in the curriculum chapter details, for each country and Benchmarking participant, the science subjects offered up to and including the eighth grade.

How Do the Benchmarking Jurisdictions Compare on Demographic Indicators?

Together, the indicators in Exhibits 2 and 3 highlight the diversity of the TIMSS 1999 countries. Although the factors the indicators reflect do not necessarily determine high or low performance in science, they do provide a context for considering the challenges involved in the educational task from country to country. Similarly, there was great diversity among the TIMSS 1999 Benchmarking participants. Exhibit 4 presents information about selected characteristics of the states, districts, and consortia that took part in the TIMSS 1999 Benchmarking Study.

As illustrated previously in Exhibit 1, geographically the Benchmarking jurisdictions were from all across the United States, although there was a concentration of east coast participants with six of the states and several of the districts and consortia from the eastern seaboard. Illinois was well represented, by the state as a whole and by three districts or consortia – the Chicago Public Schools, the Naperville School District, and the First in the World Consortium. Several other districts and consortia also had the added benefit of a state comparison – the Michigan Invitational Group and Michigan, Guilford County and North Carolina, Montgomery County and Maryland, and the Southwest Pennsylvania Math and Science Collaborative and Pennsylvania.

As shown in Exhibit 4, demographically the Benchmarking participants varied widely. They ranged greatly in the size of their total public school enrollment, from about 244,000 in Idaho to nearly four million in Texas among states, and from about 11,000 in the Michigan Invitational Group to about 430,000 in the Chicago Public Schools among districts and consortia.

It is extremely important to note that the Benchmarking jurisdictions had widely differing percentages of limited English proficient and minority students. They also had widely differing percentages of students from low-income families (based on the percentage of students eligible for free or reduced-price lunch). Among the states, more than half the students in Texas were minority students, compared with less than one-fifth in Idaho, Indiana, and Michigan. Among the school districts, those in urban areas had more than four-fifths minority students, including the Chicago Public Schools (89 percent), the Jersey City Public Schools (93 percent), the Miami-Dade County Public Schools (93 percent), and the Rochester City School District (84 percent). These four districts also had very high percentages of students from low-income families. In comparison, Naperville and the Academy School District had less than one-fifth minority students and fewer than 5 percent of their students from low-income families.

Research on disparities between urban and non-urban schools reveals a combination of often interrelated factors that together lessen students’ opportunities to learn in urban schools. Students in urban districts with high percentages of low-income and minority families often attend schools with higher proportions of inexperienced teachers.(12) Urban schools also have fewer qualified teachers than non-urban schools: drawing on the U.S. Department of Education’s 1994 Schools and Staffing Survey, a 1998 Education Week study of urban education found that urban districts experience greater difficulty filling teacher vacancies, particularly in certain fields including science, and that they are more likely than non-urban districts to hire teachers who hold an emergency or temporary license.(13) Studies of under-prepared teachers indicate that such teachers have more difficulty with classroom management, teaching strategies, curriculum development, and student motivation.(14) Teacher absenteeism is also a more serious problem in urban districts. An NCES report on urban schools found that they have fewer resources, such as textbooks, supplies, and copy machines, available for their classrooms.(15) It also found that urban students had less access to gifted and talented programs than suburban students. Additionally, several large studies have found urban school facilities to be functionally older and in worse condition than non-urban ones.(16)

How Is the Report Organized?

This report provides a preliminary overview of the science results for the Benchmarking Study. The real work will take place as policy makers, administrators, and teachers in each participating entity begin to examine the curriculum, teaching force, instructional approaches, and school environment in an international context. As those working on school improvement know full well, there is no “silver bullet” or single factor that is the answer to higher achievement in science or any other school subject. Making strides in raising student achievement requires tireless diligence in all of the various areas related to educational quality.

The report is in two sections. Chapters 1 through 3 present the achievement results: Chapter 1 presents overall achievement results, Chapter 2 shows international benchmarks of science achievement illustrated by results for individual science questions, and Chapter 3 gives results for the six science content areas. Chapters 4 through 7 focus on the contextual factors related to teaching and learning science. Chapter 4 examines student factors, including the availability of educational resources in the home, the amount of time students spend studying science outside of school, and students’ attitudes towards science. Chapter 5 provides information about the curriculum, such as the science included in participants’ content standards and curriculum frameworks as well as the topics covered and emphasized by teachers in science lessons. Chapter 6 presents information on science teachers’ preparation and professional development as well as on classroom practices. Chapter 7 focuses on school factors, including the availability of resources for teaching science and school safety.

Each of chapters 4 through 7 is accompanied by a set of reference exhibits in the reference section of the report, following the main chapters. Appendices at the end of the report summarize the procedures used in the Benchmarking Study, present the multiple comparisons for the science content areas, provide the achievement percentiles, list the topics and processes measured by each item in the assessment, and acknowledge the numerous individuals responsible for implementing the TIMSS 1999 Benchmarking Study.



1 Glidden, H. (1999), Making Standards Matter 1999, Washington, DC: American Federation of Teachers.
2 Key State Education Policies on K-12 Education: 2000 (2000), Washington, DC: Council of Chief State School Officers.
3 Smith, T.A., Martin, M.O., Mullis, I.V.S., and Kelly, D.L. (2000), Profiles of Student Achievement in Science at the TIMSS International Benchmarks: U.S. Performance and Standards in an International Context, Chestnut Hill, MA: Boston College.
4 Orlofsky, G.F. and Olson, L. (2001), “The State of the States” in Quality Counts 2001, A Better Balance: Standards, Tests, and the Tools to Succeed, Education Week, 20(17).
5 TIMSS was administered in the spring of 1995 in northern hemisphere countries and in the fall of 1994 in southern hemisphere countries, both at the end of the school year.
6 Martin, M.O., Mullis, I.V.S., Gonzalez, E.J., Gregory, K.D., Smith, T.A., Chrostowski, S.J., Garden, R.A., and O’Connor, K.M. (2000), TIMSS 1999 International Science Report: Findings from IEA’s Repeat of the Third International Mathematics and Science Study at the Eighth Grade, Chestnut Hill, MA: Boston College; Mullis, I.V.S., Martin, M.O., Gonzalez, E.J., Gregory, K.D., Garden, R.A., O’Connor, K.M., Chrostowski, S.J., and Smith, T.A. (2000), TIMSS 1999 International Mathematics Report: Findings from IEA’s Repeat of the Third International Mathematics and Science Study at the Eighth Grade, Chestnut Hill, MA: Boston College.
7 Gonzales, P., Calsyn, C., Jocelyn, L., Mak, K., Kastberg, D., Arafeh, S., Williams, T., and Tsen, W. (2000), Pursuing Excellence: Comparisons of International Eighth-Grade Mathematics and Science Achievement from a U.S. Perspective, 1995 and 1999, NCES 2001-028, Washington, DC: National Center for Education Statistics.
8 Mullis, I.V.S., Martin, M.O., Gonzalez, E.J., O’Connor, K.M., Chrostowski, S.J., Gregory, K.D., Garden, R.A., and Smith, T.A. (2001), Mathematics Benchmarking Report, TIMSS 1999 – Eighth Grade: Achievement for U.S. States and Districts in an International Context, Chestnut Hill, MA: Boston College.
9 Robitaille, D.F., McKnight, C.C., Schmidt, W.H., Britton, E.D., Raizen, S.A., and Nicol, C. (1993), TIMSS Monograph No. 1: Curriculum Frameworks for Mathematics and Science, Vancouver, BC: Pacific Educational Press.
10 Nohara, D. (working paper 2001), A Comparison of Three Educational Assessments: NAEP, TIMSS-R, and PISA, Washington, DC: National Center for Education Statistics.
11 Appendix A contains an overview of the procedures used. More detailed information is provided in Martin, M.O., Gregory, K.A., and Stemler, S.E., eds., (2000), TIMSS 1999 Technical Report, Chestnut Hill, MA: Boston College.
12 Mayer, D.P., Mullens, J.E., and Moore, M.T. (2000), Monitoring School Quality: An Indicators Report, NCES 2001-030, Washington, DC: National Center for Education Statistics.
13 Quality Counts 1998, The Urban Challenge: Public Education in the 50 States, Education Week, 17(17).
14 Darling-Hammond, L. and Post, L. (2000), “Inequality in Teaching and Schooling: Supporting High Quality Teaching and Leadership in Low-Income Schools” in R. Kahlenberg (ed.), A Notion at Risk: Preserving Public Education as an Engine for Social Mobility, Century Foundation Press.
15 Lippman, L., Burns, S., and McArthur, E. (1996), Urban Schools: The Challenge of Location and Poverty, NCES 96-184, Washington, DC: National Center for Education Statistics.
16 Lewis, L., Snow, K., Farris, E., Smerdon, B., Cronen, S., Kaplan, J., and Greene, B. (2000), Condition of America’s Public School Facilities: 1999, NCES 2000-032, Washington, DC: National Center for Education Statistics; School Facilities: America’s Schools Report Differing Conditions (1996), GAO/HEHS-96-103, Washington, DC: U.S. General Accounting Office.
