Session Title: Supporting Value Judgments in Evaluations in the Public Interest
Panel Session 901 to be held in Pacific A on Saturday, Nov 5, 12:35 PM to 2:05 PM
Sponsored by the Government Evaluation TIG and the Presidential Strand
George Julnes, University of Baltimore, gjulnes@ubalt.edu
Michael Morris, University of New Haven, mmorris@newhaven.edu
Stephanie Shipman, United States Government Accountability Office, shipmans@gao.gov
Abstract: To better understand valuing in the public interest, it is important to encourage a dialogue among evaluators in government and related organizations. This session provides presentations from Francois Dumaine describing Canadian evaluations, Martin De Alteriis discussing GAO evaluations, and Christina Christie and Anne Vo presenting a model of the role of evaluators in valuing.
The Evaluator's Role in Valuing: Who and with Whom
Anne Vo, University of California, Los Angeles, annevo@ucla.edu
Christina Christie, University of California, Los Angeles, tina.christie@ucla.edu
Evaluation scholars and practitioners have dedicated much energy and effort to shaping and defining the program evaluation profession. However, careful examination of the program evaluation literature turns up only a few resources that describe and operationalize value judgments, the ways in which they are reached, and who is involved in this aspect of the evaluation process. We argue in this paper that the valuing act may be perceived in many different ways and consider the multiple theoretic perspectives that govern an evaluator's behavior. Based on this analysis, we develop a typology of evaluator valuing roles and suggest that value judgments are typically reached by stakeholders alone, stakeholders and evaluators in consort with each other, or by evaluators only. This heuristic helps us to gain a more explicit understanding of the valuing act and process as it occurs in the context of an evaluation.
Is Playing Hitman the Right Role for Evaluation?
Francois Dumaine, PRA Inc, dumaine@pra.ca
The process is secretive, which naturally fuels the fears of bureaucrats. Simply labeled "Program Review", this initiative from the Canadian government forces each department to assess, on a cyclical basis, all its activities, and identify which ones must go. No way around it: at each review, at least five percent of current spending must be freed up in order to be reinvested. To either guard or target initiatives, evaluation reports have become a prominent tool. And thanks to a revamped evaluation policy, all activities within each department shall be evaluated on a cyclical basis, providing ample evidence when engaging in a program review. Not surprisingly, as its role shifts, and its actions become more consequential, program evaluation is being scrutinized. This presentation uses the program review experiment to explore the set of assumptions about public policy that drive program evaluation, and assess its impact on the function of evaluation.
Using Criteria to Assess Agency Monitoring and Evaluation: Recent Government Accountability Office (GAO) Assessments of U.S. Foreign Assistance Programs
Martin De Alteriis, United States Government Accountability Office, dealteriism@gao.gov
Some recent GAO engagements have examined the ways in which U.S. federal government agencies monitor and evaluate particular programs and activities. For example, in the area of U.S. foreign assistance, GAO has looked at the State Department's evaluation of certain public diplomacy programs, USAID's monitoring and evaluation (M&E) of its Food for Peace program, and USDA's M&E of its McGovern-Dole school feeding program. These engagements examined factors such as the performance measures used, the reporting of results, M&E policies and procedures, the staff devoted to M&E, and the evaluations conducted. This paper will focus on the types of criteria that were employed in those engagements. For example, GAO has used its own performance measurement standards and AEA Evaluation Policy Task Force 'Road Map' standards. This paper will also discuss some of the challenges that can arise when using these sources of criteria, such as the challenge of operationalizing concepts such as 'Independence.'

Session Title: The Value of Organizational Modeling of Evaluation Protocols and Standards From the State and National Level in Extension
Multipaper Session 902 to be held in Pacific B on Saturday, Nov 5, 12:35 PM to 2:05 PM
Sponsored by the Extension Education Evaluation TIG
Karen Ballard, University of Arkansas, kballard@uaex.edu
Understanding the Practice of Evaluation in Extension: Test of a Causal Model
Alexa Lamm, University of Florida, alamm@ufl.edu
Glenn D Israel, University of Florida, gdisrael@ufl.edu
Abstract: Extension funding comes from local, state, and federal dollars; therefore, the primary driver for evaluation is accountability for public funds. Historically, evaluation has been considered a necessary component in Extension rather than a priority. As public budgets get cut, the need for Extension to demonstrate programmatic public value is increasing. The ability to provide credible information depends on evaluation activities. The purpose of this research was to examine how organizational Extension evaluation structures directly and indirectly influence evaluation behaviors of Extension professionals. Data were collected from Extension professionals in eight states to examine how their perceptions of organizational evaluation factors influenced their evaluation behaviors. The results show that changes at multiple levels can affect evaluation behavior. Extension leaders can impact evaluation practices by changing their own behavior, establishing a social culture within the system supportive of evaluation, and placing an emphasis on skill training in evaluation.
Evaluating for Value: A New Direction for Youth Program Evaluation
Mary Arnold, Oregon State University, mary.arnold@oregonstate.edu
Melissa Cater, Louisiana State University AgCenter, mcater@agcenter.lsu.edu
Abstract: Evaluating youth development programs, such as 4-H, has received considerable attention in the past 10 years. Establishing best practices for youth program evaluation, especially for the evaluation of small local programs, remains a perennial and multifaceted concern. This paper presents a brief history of youth program evaluation, and concludes that many youth-serving programs lack the resources to conduct comprehensive, rigorous, experimental studies. Then, drawing on recent advances in the literature related to youth program evaluation, the authors argue for focusing more on three current areas of promising youth program evaluation practice: 1) the evaluation of program implementation and program quality; 2) building the evaluation capacity of program staff who are often charged with conducting evaluations; and 3) engaging youth in the evaluation of programs that affect them through youth participatory evaluation.
Evaluating Impact at the Systems-Level: Implementing the First Cross-Site Evaluation Within the Children, Youth, and Families At Risk (CYFAR) Initiative
Lynne Borden, University of Arizona, bordenl@ag.arizona.edu
Christine Bracamonte Wiggs, University of Arizona, cbmonte@email.arizona.edu
Amy Schaller, University of Arizona, aschalle@email.arizona.edu
Abstract: Programs currently exist in an atmosphere where they need to remain relevant and responsive to their target populations, funders, and policy makers. In an effort to document performance accountability, demonstrate impact, and promote sustainability, many funders require their programs to complete a common measure (cross-site) assessment. This paper will highlight the implementation of a common set of measures within the Children, Youth, and Families At Risk (CYFAR) system, funded by the National Institute of Food and Agriculture (NIFA), United States Department of Agriculture (USDA). This paper will discuss the common evaluation measures used to collect aggregate-level data, the process of collecting cross-site data, and preliminary findings from the first round of data collection. The paper will also address key considerations for building systems-level evaluation capacity, including how the incorporation of technology can support these efforts.

Session Title: Fidelity Checks and Interpretation in Producing Evaluation Value
Multipaper Session 903 to be held in Pacific C on Saturday, Nov 5, 12:35 PM to 2:05 PM
Sponsored by the Quantitative Methods: Theory and Design TIG
M H Clark, University of Central Florida, mhclark@mail.ucf.edu
Evaluators' Use of Questionnaires and Score Interpretation
Randall Schumacker, University of Alabama, rschumacker@ua.edu
Abstract: Evaluators often use survey research methods to collect data on attitudes, perceptions, satisfaction, or opinions when evaluating persons or programs. Questionnaires use response scales for sets of items; however, most scales yield ordinal data, e.g., SA, A, N, D, and SD, or have a limited numerical range, e.g., 1 to 5 (Alreck & Settle, 2004). How variables are measured or scaled influences the type of statistical analyses we should conduct (Anderson, 1961; Stevens, 1946). Parametric statistics are typically used to analyze survey data, however, even when their statistical assumptions are not met. Rasch rating scale analysis of questionnaire responses can produce continuous measures from the ordered categorical responses. The Rasch rating scale analysis provides for the interpretation of the rating scale category effectiveness. Rasch person logits can be used in a linear transformation formula to produce scaled scores that range from 0 to 100, thus providing meaningful score interpretation and statistical analysis.
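The linear transformation mentioned in this abstract can be illustrated with a minimal sketch. The abstract does not give the formula, so this assumes a simple min-max rescaling in which the lowest logit maps to 0 and the highest to 100; the function name `scale_logits` and the sample logits are hypothetical.

```python
def scale_logits(logits, low=None, high=None):
    """Linearly rescale Rasch person logits to a 0-100 score range.

    Assumption: `low` and `high` are the logit values that should map to
    0 and 100. When omitted, the observed minimum and maximum are used.
    """
    low = min(logits) if low is None else low
    high = max(logits) if high is None else high
    if high == low:
        raise ValueError("logit range is zero; cannot rescale")
    # slope = 100 / (high - low), intercept = -low * slope
    return [100.0 * (x - low) / (high - low) for x in logits]


# Hypothetical person logits for three respondents
print(scale_logits([-2.0, 0.0, 2.0]))
```

Fixing `low` and `high` to the theoretical extremes of the instrument, rather than the observed extremes, would keep scaled scores comparable across samples.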
Values in Practical Evaluation: The Development and Validation of a Self-Reported Measure of Program Fidelity
Dennis Johnston, AVID Center, djohnston@avidcenter.org
Philip Nickel, AVID Center, pnickel@avidcenter.org
Abstract: Advancement Via Individual Determination (AVID) is a college preparatory system with the mission of preparing all students for college and career readiness. As AVID grew to over 4,000 schools, the need became apparent for a measure of program implementation fidelity. AVID staff developed the Certification Self-Study (CSS) measure to assist schools to implement AVID and to provide the AVID Center with information necessary to monitor the quality, consistency, and fidelity of AVID programs. This paper examines the values inherent in the process of developing this measure and the psychometric evaluation of the CSS. Results indicate that each subscale met sufficient levels of internal consistency and that sites with higher levels of implementation fidelity evidenced higher outcomes. Discussion includes the value of using psychometrically validated measures, the educational values inherent in the AVID program, and how the method of CSS data collection reflects those values.
Using Measures of Implementation to Enhance the Interpretability of Experiments
Mark Hansen, University of California, Los Angeles, markhansen@ucla.edu
Abstract: The analyses presented here are based on a study of an innovative high school curriculum and related teacher professional development activities. The curriculum focused on improving students' reading comprehension. A variety of measures were used to assess the extent to which teachers implemented the prescribed curriculum. The extent to which students utilized various reading strategies emphasized within the curriculum was also examined. Treatment effects were estimated using multilevel models (students nested within classrooms). Following the approach of Hulleman and Cordray (2009), we calculated indices of treatment fidelity across the study groups for both teacher and student variables. Finally, we investigated how treatment strength may have affected inferences about effectiveness. There was evidence of a positive treatment effect on literacy instruction but no significant effects on student implementation (utilization of reading strategies) or student outcome variables. However, there appear to be positive relationships between student implementation variables and student outcomes.
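As a rough illustration of a treatment fidelity index compared across study groups, the sketch below computes a standardized mean difference in fidelity scores between treatment and control classrooms. This form is an assumption; the abstract does not state which indices were calculated or how. The function name `achieved_relative_strength` and the scores are hypothetical.

```python
from math import sqrt
from statistics import mean, variance


def achieved_relative_strength(treatment, control):
    """Standardized difference in mean fidelity between two groups.

    Assumption: fidelity is a numeric score per classroom, and the index
    is the mean difference divided by the pooled standard deviation.
    """
    n_t, n_c = len(treatment), len(control)
    # Pooled variance weights each group's sample variance by its degrees of freedom
    pooled_var = (
        (n_t - 1) * variance(treatment) + (n_c - 1) * variance(control)
    ) / (n_t + n_c - 2)
    return (mean(treatment) - mean(control)) / sqrt(pooled_var)


# Hypothetical fidelity scores for three classrooms per group
print(achieved_relative_strength([3, 4, 5], [1, 2, 3]))
```

A value near zero would suggest that treatment and control classrooms received similar instruction in practice, which weakens any inference drawn from a null outcome effect.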

Session Title: Increasing Knowledge and Use of Evaluation in the Nonprofit Sector
Multipaper Session 904 to be held in Pacific D on Saturday, Nov 5, 12:35 PM to 2:05 PM
Sponsored by the Non-profit and Foundations Evaluation TIG
Michelle Baron, The Evaluation Baron LLC, michelle@evaluationbaron.com
Building a "Super" Logic Model: Developing a System of Tiered Logic Models to Identify Key Outcomes in a Large Nonprofit Organization
Susan Connors, University of Colorado, Denver, susan.connors@ucdenver.edu
Joyce Schlose, Goodwill Industries of Denver, jschlose@goodwilldenver.org
Amelia Challender, University of Colorado, Denver, amelia.challender@ucdenver.edu
Kelci Price, University of Colorado, Denver, kelci.price@ucdenver.edu
Abstract: Goodwill Denver is a nonprofit organization providing a multitude of community services to youth and disabled/disadvantaged citizens. The organization had a history of collecting a wide array of accountability data. To increase the organization's ability to identify those key outcomes most central to their mission, evaluators worked collaboratively with staff members to develop a series of tiered logic models to describe the inputs and outcomes of each distinct program and organizational unit. Finally, a 'super' logic model was synthesized to describe essential outcomes across all services. Evaluators will share the benefits and challenges of using this process for conducting a comprehensive program evaluation. A representative from Goodwill Denver will share the value of the resulting logic models for organizational learning.
Raising the Level of Evidence: How to Develop Evidence-Based Practices Within Local Non-Profits
Kelly Hill, Nexus Research Group, khill@nexusresearchonline.com
Dana Rickman, Georgia Partnership for Excellence in Education, drickman@gpee.org
Janelle Williams, The Center for Working Families Inc, jwilliams@tcwfi.org
Abstract: The importance of evidence-based practice in developing and promoting responsible social interventions is well documented. What we know less about, however, is the means by which an organization moves from "promising" or "best" practice to evidence-based practice. What are the key steps as well as real barriers and challenges? In this session, we offer a framework for moving a group towards an evidence-based practice using the example of The Center for Working Families, Inc. (TCWFI). TCWFI is a community-based non-profit which provides bundled services to individuals and families living within five contiguous neighborhoods in Atlanta, Georgia. To better serve its participants, the organization has engaged in an extensive process evaluation, revising core processes and updating its evaluation infrastructure. By exploring this example, we hope to provide some practical advice on how evaluators might be more effective in helping community-based groups raise the empirical bar.
Advancing Social Change Through Evaluation: The Value of Evaluation in the Nonprofit Sector
Joanna Klak, Independent Consultant, klakjoanna@wp.pl
Abstract: Although the value of evaluating projects, programs, and policies has long been understood, its usefulness has not been fully discovered by nonprofit organizations, especially in developing countries. A research study conducted among nonprofit leaders in Poland revealed that there is limited capacity for, and understanding of, evaluation among staff members in nonprofit organizations. In addition, evaluation is seldom used by donor agencies or recipient organizations to improve program effectiveness. This raises ethical concerns about the efficient use of donated resources and the effects on the intended beneficiaries of aid. An urgent need is signaled for professional evaluators to disseminate their knowledge about evaluation in the nonprofit world. The author concludes with a reflection on the professional evaluators' potential to challenge the status quo, and the evaluators' role in bringing social change.

In a 90 minute Roundtable session, the first rotation uses the first 45 minutes and the second rotation uses the last 45 minutes.
Roundtable Rotation I: Coalition Evaluation: The Power of Coalitions to Change a Cultural or Organizational Norm
Roundtable Presentation 905 to be held in Conference Room 1 on Saturday, Nov 5, 12:35 PM to 2:05 PM
Sponsored by the Collaborative, Participatory & Empowerment Evaluation TIG
Lesli Johnson, Ohio University, johnsol2@ohio.edu
Abstract: Using coalitions to address complex social problems has become a recognized strategy for change. We explore the importance of coalition evaluation by reviewing three initiative evaluations that employed coalitions as vehicles to change cultural or organizational norms. The relationship between individual coalition characteristics, including coalition effectiveness, and the ability of multiple coalitions to achieve a change in a cultural or organizational norm is discussed. Three initiatives are reviewed in terms of their effectiveness in altering a community or organizational norm: a statewide tobacco use prevention initiative, using community coalitions to change the social acceptability of the use of tobacco products; a statewide initiative, changing the organizational norm of local Adult Basic Literacy Education (ABLE) programs towards encouraging their students to pursue education and training beyond the high school equivalency exam; and a foundation initiative, promoting school-based childhood wellness and obesity prevention through the development of school wellness councils.
Roundtable Rotation II: 'Trickle Down' or 'Bubble Up': Using Evaluation to Build a Useful Model for Implementing a Policy of Collaboration
Roundtable Presentation 905 to be held in Conference Room 1 on Saturday, Nov 5, 12:35 PM to 2:05 PM
Sponsored by the Collaborative, Participatory & Empowerment Evaluation TIG
Rachael Lawrence, University of Massachusetts, Amherst, rachaellawrence@ymail.com
Sharon Rallis, University of Massachusetts, Amherst, sharonrallis@earthlink.net
Abstract: Evaluating collaboration between education providers and mental health workers serving institutionalized youth proved challenging due to complexity in and power differentials across the system. First, two state agencies mandated policy that all providers and levels collaborate. Second, these agencies contracted multiple service providers who hired site level practitioners. Policy was expected to 'trickle down' through at least three levels to program operations. Further, each agency and level interpreted collaboration differently. What collaboration meant - or if collaboration existed - was ultimately defined by varying practices at sites. As external evaluators, we quickly realized that simply measuring collaboration was not feasible, nor would it produce useful information to either program leadership or practitioners. Instead, we refocused to document practices in operation from which we generated a theory-based model that proved useful to understand and support a policy 'bubbling up' from practice. This roundtable will discuss how practitioners and leadership used our model.

Roundtable: Evaluation as a Methodology for Understanding and Enabling Interdisciplinary Team Science
Roundtable Presentation 906 to be held in Conference Room 12 on Saturday, Nov 5, 12:35 PM to 2:05 PM
Sponsored by the Research, Technology, and Development Evaluation TIG
Deana Pennington, University of Texas at El Paso, ddpennington@utep.edu
Allison Titcomb, ALTA Consulting LLC, altaconsulting@cox.net
Marcia Nation, Arizona State University, marcia.nation@asu.edu
Abstract: In recent years, there has been increasing emphasis in science and technology research on interdisciplinary team efforts, particularly in health, environmental, and global change research. These complex team efforts are fraught with difficulties that have been identified through numerous case studies. In this Roundtable we will discuss opportunities for integrating Theory-Driven Developmental Evaluation methods with Design-Based Research methods to inform our understanding of interdisciplinary research teams, assess team effectiveness, and improve team outcomes. References: Coryn et al. (2010). A Systematic Review of Theory-Driven Evaluation Practice From 1990 to 2009. Am. J. of Eval. Online URL: http://aje.sagepub.com/content/early/2010/11/11/1098214010389321. Sandoval (2004). Developing Learning Theory by Refining Conjectures Embodied in Educational Designs. Ed. Psych. 39(4):213-223. Stokols et al. (2008). The Ecology of Team Science. Amer. J. Prev. Med. 35(2S):S96-S115.

Session Title: Building Evaluation Capacity in International Development Programs: Theory and Practice
Multipaper Session 907 to be held in Conference Room 13 on Saturday, Nov 5, 12:35 PM to 2:05 PM
Sponsored by the International and Cross-cultural Evaluation TIG
Maby Palmisano, ACDI/VOCA, mpalmisano@acdivoca.org
Evaluating Capacity Development in International Programs: Status, Challenges, and Future Prospects
Douglas Horton, Independent consultant, d.horton@mac.com
Abstract: Capacity development is a common goal of international development programs. There are frequent calls to evaluate capacity development, in order to gauge the results (returns to investment) and learn lessons that can be used to improve future programs. However, few capacity development interventions have been systematically evaluated to date. Based on a review of publications and materials available on the Internet and on the author's own development work, this paper assesses the current state of knowledge and practice in this field. While there has been rapid growth in the number of guidelines and methods for evaluating capacity development, few capacity development interventions appear to have been systematically evaluated to date. The paper explores why this is true and what might be done to remedy the situation. In addition to daunting conceptual and methodological challenges, perhaps the most important barriers to evaluating capacity development are of a political nature.
Achieving Evidence-based Policy Through Evaluation Capacity Development in Central America
Stefanie Krapp, German International Cooperation, stefanie.krapp@giz.de
Abstract: Since the 1990s, Costa Rica has successfully anchored institutional mechanisms of monitoring and evaluation (M&E). Other countries in Central America demonstrate interest in strengthening their M&E capacities, as evidenced through the implementation of variations of M&E systems, but they are very limited in their methodological and institutional development and sectoral coverage. By means of the new program 'Evaluation Capacity Development (ECD)', the German Ministry of Economic Cooperation and Development (BMZ) will support the Costa Rican efforts in strengthening the institutional evaluation capacities. This will help establish a culture capable of reporting evaluation impact within their Results Based Management System. Other Central American countries will be included in the common learning process by concerted regional ECD initiatives. The presentation will begin with an overview of the state of M&E in Central America. It will then introduce the general concept of the ECD program and show how the ECD will be integrated in Central America. Finally, the presentation will highlight how the ECD measures will directly and indirectly contribute to achieving evidence-based policy in the region.
Women's Global Connection: Making Values Explicit in Capacity Building and Evaluation Efforts for Sustainable Development
Dorothy Ettling, University of the Incarnate Word, ettling@uiwtx.edu
Ada Gonzalez, University of the Incarnate Word, aagonza1@student.uiwtx.edu
Abstract: As Women's Global Connection forms partnerships with women's groups in Africa and South America focused on economic and educational development, it constantly reflects on the socio-political, economic, and cultural diversity of the stakeholders' values, interests, and skills. It is important to preserve the position of the stakeholders within the planning, implementation, and evaluation components of projects. Using a women's empowerment model (Ettling, Caffer and Buck, 2010) as a theoretical framework and Community-Based Participatory Research (Minkler & Wallerstein, 2008) as a research approach, the researchers and women's groups organically identify local issues and the socio-cultural contexts of the women's work while co-creating the curriculum of capacity-building efforts, and the level of commitment to and use of evaluation data. This paper explores how an evaluation model, based on negotiated terms and values, is implemented when the ultimate goal is poverty elimination.
Evaluative Thinking: The Missing Link in Evaluation Capacity Building in the South
Sonal Zaveri, Community of Evaluators for South Asia, sonalzaveri@gmail.com
Abstract: Not addressing evaluative thinking of beneficiaries and implementers in theory or practice indicates a significant gap in South evaluation capacity building, missing the opportunities for dialogue with beneficiaries (1) to reflect deeply about what worked, what did not, and why. Using Asian case studies, the paper describes how a lack of evaluative thinking made the practice of utilization focused evaluation (2) difficult, how challenges were overcome, and how increased opportunities for evaluative thinking (in addition to methods) led to increased use of findings, sometimes in unintended ways. The paper suggests that theorists, evaluators, and commissioners of evaluation, mostly based in the North, need to promote and value South evaluative thinking, especially with constrained resources, to build 'real' capacities and experience meaningful change. References: Ramirez, R. (2008). A mediation on meaningful participation. The Journal of Community Informatics, Vol 4, No 3. Smith, M. F. (1999). Participatory evaluation: Not working or not tested? American Journal of Evaluation, 20(2), 295. Patton, M. (2008). Utilization Focused Evaluation. SAGE.
The Evaluator in Evaluation Capacity Building: Three Scenarios from Three Regions in the International Context
Hubert Paulmer, Harry Cummings and Associates Inc, hubertpaulmer@gmail.com
Abstract: The paper highlights the key role evaluators can play in Evaluation Capacity Building (ECB) and presents recent experience from integrating ECB into evaluations. The paper looks at the ECB concept and the different methods. It shares the approaches and processes used by the evaluator in ECB at various levels, in three different projects of three different organizations in South Asia, Africa and Eurasia. The three projects were from different sectors (education, health, and water and sanitation). The paper presents the 'ECB experience' of how it was designed and carried out in each of the three cases, in addition to the information on stakeholders who benefited and how. The paper also looks at the external and internal environment that was required to make ECB possible in these three cases. The paper also highlights how ECB organically implies a collaborative / participatory approach to evaluation.

Session Title: Learning From Community Evaluations: Theory and Practice
Multipaper Session 908 to be held in Conference Room 14 on Saturday, Nov 5, 12:35 PM to 2:05 PM
Sponsored by the Collaborative, Participatory & Empowerment Evaluation TIG
Connie Walker, University of South Florida, cwalkerpr@yahoo.com
Understanding & Evaluating Participation in Community-based Social Change Organizations: A Conceptual Guide to the Participatory Evaluation Process
Justina Grayman, New York University, justina.grayman@nyu.edu
Abstract: This paper seeks to answer the question: How can evaluators and community-based organizations evaluate barriers and supports to community members participating in their social change efforts? First, I present a conceptual model of factors at the individual, organizational, and community levels that influence organizational participation based on empirical and theoretical literature within community psychology. Second, I present a model for the pre-evaluation process that describes the process through which researchers and community-based organizations can efficiently decide upon meaningful domains of focus for studies of community participation. Suggestions are presented for methods of evaluating this process, including analysis of meeting minutes and follow-up interviews with process participants. Third, I illustrate the applications of the conceptual model and process of participatory evaluation by describing a case study of an evaluation effort involving a community-based social change organization in New York City. Limitations of the model are addressed.
The Promise and Prospect of Participatory Evaluation Approaches in Family-Centered Paediatric Healthcare Settings: The Results of a Mixed Methods Study
Katherine Moreau, University of Ottawa, kmoreau@cheo.on.ca
J Bradley Cousins, University of Ottawa, bcousins@uottawa.ca
Abstract: Paediatric healthcare settings have transitioned from a medically focused to a family-centered model of programming. Often described as family-centered care (FCC), this model recognizes that each family is unique, that parents know their children best, and that optimal child functioning occurs within a supportive family context. It advocates that families are consumers whose needs, priorities, and opinions should be respected, and that family engagement should be encouraged in all program aspects including evaluation. However, many healthcare professionals and evaluators working in family-centered settings fail to select evaluation approaches that promote active family engagement. This paper describes the results of a mixed methods study that examined the strengths, limitations, and consequences of the current evaluation approaches used in these settings. It illustrates the promise and prospects of participatory evaluation approaches that, in theory, are compatible with the philosophy of FCC and allow for family engagement.
Participatory Action Evaluation: A Practical, Concept-driven Administrative Tool and Community Capacity Building Strategy
Cindy Banyai, Refocus Institute, refocusinstitute@gmail.com
Abstract: Participatory action evaluation (PAE) is a type of concept-driven participatory action research and is intended to provoke thought and discussion among its participants, thus building community capacity, as well as generating a wealth of information useful to researchers and decision-makers alike. This paper describes a pioneer PAE case focusing on participatory video conducted in Pagudpud, Philippines. The findings reveal that participatory action evaluation has the dual function of providing information for policy-making, and community capacity building by empowering people through information dissemination, critical community discussion, and leadership development. This work adds to the dialogue on action science, evaluation, and participatory methods.
The Evaluation of the Milwaukee Community Literacy Project: A Case Example for Integrating Summative and Participatory Evaluation Approaches
Rachel Lander, University of Wisconsin, Madison, rlander@wisc.edu
Curtis Jones, University of Wisconsin, Madison, cjjones5@wisc.edu
Robert Meyer, University of Wisconsin, Madison, rhmeyer@wisc.edu
Abstract: In this paper, we present our evaluation of the Milwaukee Community Literacy Project (MCLP). The Boys & Girls Clubs of Greater Milwaukee (BGCGM) was awarded an Investing in Innovation award from the Department of Education for the MCLP, a collaboration between BGCGM and the Milwaukee Public Schools (MPS) designed to improve the literacy development of K-3rd grade students in seven schools through one-on-one tutoring, after-school programming, and parent/family support. First, we present the program logic model and the methods we are using to document inputs, outputs, and outcomes. We then discuss our reasoning for, and the implications of, our decision to apply a participatory action research (PAR) approach to our evaluation. Finally, we discuss the specific strategies being employed in the PAR to document the collaboration between MPS and BGCGM, and the results of these efforts. We conclude with a discussion of the implications of using PAR in high-stakes evaluations.
The Partnership Development Rubric: An Innovative Assessment Tool to Support Community-Campus Partnerships and Address a Critical Need in Community-based Participatory Research (CBPR)
Kathryn Nearing, University of Colorado, Denver, kathryn.nearing@ucdenver.edu
Abstract: The Partnership of Academicians and Communities for Translation (PACT) Council is the governing body of the Community Engagement program - a major component of the NIH-funded Colorado Clinical and Translational Sciences Institute (CCTSI). CCTSI evaluators, in collaboration with community liaisons, developed a rubric to assess the evolution of this community-campus partnership. The rubric explores the formation (4 items) and functioning (3 items) of partnerships, and the establishment of a sense of cultural safety and humility (3 items). Each item is presented as a continuum - a series of descriptions that represent the deepening of a partnership. For each item, respondents are asked to determine which description best represents how they currently experience a partnership. Space is provided to capture insights and reflections. This paper features the rubric, the theoretical underpinnings, and early lessons learned from the initial utilization and subsequent adaptation of the tool to support evaluating other large-scale collaborative initiatives.

Session Title: Developing Effective Recommendations: From Theory to Action
Multipaper Session 909 to be held in Avila A on Saturday, Nov 5, 12:35 PM to 2:05 PM
Sponsored by the Evaluation Use TIG
Edward McLain, University of Alaska, Anchorage, afeam1@uaa.alaska.edu
Linking the Production of Evaluation Knowledge to the Context of Its Application
Thomas Schwandt, University of Illinois at Urbana-Champaign, tschwand@illinois.edu
Natasha Jankowski, University of Illinois at Urbana-Champaign, njankow2@illinois.edu
Abstract: The use of evaluation findings is a topic that has generated multiple definitions of use, as well as types and taxonomies of use. However, the literature is relatively silent on the fact that 'use' rests on conceptualizations of how the context of the production of evaluation information and evidence is related to the context of the application of that information and evidence. This paper presents and appraises three ways in which this relationship has been conceptualized—dissemination/transmission, translation, and interaction.
Users' Perspectives of the Factors That Influence Whether Recommendations Get Implemented
Shevaun Nadin, Carleton University, snadin@connect.carleton.ca
Bernadette Campbell, Carleton University, bernadette_campbell@carleton.ca
Abstract: As an intended step towards social betterment, evaluators often provide recommendations for program improvement. However, recommendations promote social betterment only insofar as they are actually used, and unfortunately they are often ignored. Despite the vast literature on evaluation use, it is still not clear which factors most strongly influence whether evaluation findings get used. Missing from the literature is an empirical examination of the relative importance of, and users' opinions about, the factors that most affect instrumental use. To address that gap, evaluation users' opinions about the relative importance of factors that facilitate recommendation implementation were explored. Using Q methodology, various statements about what influences instrumental use were drawn from the evaluation literature, and a sample of evaluation users rank-ordered those statements. The emergent perspectives are presented, and the practical and theoretical implications of the findings are discussed.
Six Steps to Effective Recommendations
Michael Hendricks, Independent Consultant, mikehendri@aol.com
Abstract: Recommendations are one of the most important, yet also one of the least-addressed, aspects of evaluations. More than a few evaluators and evaluation managers seem to think that recommendations flow naturally from the findings of an evaluation, even if we give them only a few hours' worth of thought at the very end of our work. As a result, too often our recommendations don't flow from our findings, aren't feasible, wouldn't solve the problems being addressed, or aren't presented in a way to gain acceptance by key stakeholders. The presentation will describe three separate phases of the process: (1) developing effective recommendations, from the first inkling of an idea to the final wordsmithing, (2) presenting effective recommendations, ideally in various different ways, and (3) following up effectively on recommendations, whether they've been accepted, rejected, or deferred. Within these three phases, I will present six specific steps to effective recommendations.

Session Title: Systems Thinking Evaluation Tools and Approaches for Measuring System Change
Multipaper Session 910 to be held in Avila B on Saturday, Nov 5, 12:35 PM to 2:05 PM
Sponsored by the Systems in Evaluation TIG
Mary McEathron, University of Minnesota, mceat001@umn.edu
Effective Incorporation of Systems Thinking in Evaluation Practice: An Integrative Framework
Kanika Arora, Syracuse University, arora.kanika@gmail.com
William Trochim, Cornell University, wmt1@cornell.edu
Abstract: The support for systems thinking in evaluating complex social programs has increased markedly in recent years. However, despite there being a general agreement on the holistic advantage provided by the systems approach, the practical implementation of systems thinking remains challenging. For many practitioners, the most effective way to integrate systemic thinking in conventional evaluation activities is still ambiguous. In response, this paper aims to develop an integrative theoretical framework that connects key elements in the systems paradigm to foundational concepts in the evaluation literature. Specifically, an attempt is made to link systemic ideas of inter-relationships, perspectives and boundaries to Scriven's classification of evaluation types and to Campbell's validity categories. With this framework, we can begin to explore strategies for enhanced implementation of systems thinking in mainstream evaluation practice.
Keeping Track in Complicated or Complex Situations: The Process Monitoring of Impacts Approach
Richard Hummelbrunner, OEAR Regionalberatung, hummelbrunner@oear.at
Abstract: This monitoring approach systematically observes the processes that are expected to lead to results or impacts of an intervention. It builds on the assumption that inputs (as well as outputs) have to be used by someone to produce desired effects. A set of hypotheses is identified on the desired use of inputs or outputs by various actors (e.g. partners, project owners, target groups) whose use is considered decisive for the achievement of effects. These hypotheses are incorporated in logic models as statements of 'intended use', and these assumptions are monitored during implementation to check whether they remain valid and actually take place, or should be amended (e.g. to capture new developments or unintended effects). The paper describes the approach as well as the experience gained in Austria and beyond, in particular applications for monitoring programmes, to provide an adequate understanding of their performance under more complex and dynamic implementing conditions.
The Development and Validation of Rubrics for Measuring Evaluation Plan, Logic Model, and Pathway Model Quality
Jennifer Urban, Montclair State University, urbanj@mail.montclair.edu
Marissa Burgermaster, Montclair State University, burgermastm1@mail.montclair.edu
Thomas Archibald, Cornell University, tga4@cornell.edu
Monica Hargraves, Cornell University, mjh51@cornell.edu
Jane Buckley, Cornell University, janecameronbuckley@gmail.com
Claire Hebbard, Cornell University, cer17@cornell.edu
William Trochim, Cornell University, wmt1@cornell.edu
Abstract: A notable challenge in evaluation, particularly systems evaluation, is finding concrete ways to capture and assess quality in program logic models and evaluation plans. This paper describes how evaluation quality is measured quantitatively using logic model and evaluation plan rubrics. Both rubrics are paper-and-pencil instruments assessing multiple dimensions of logic models (35 items) and evaluation plans (73 items) on a five-point scale from one to five. Although the rubrics were designed specifically for use with a systems perspective on evaluation plan quality, they can potentially be utilized to assess the quality of any logic model and evaluation plan. This paper focuses on the development and validation of the rubrics and will include a discussion of inter-rater reliability, the factor analytic structure of the rubrics, and scoring procedures. The potential use of these rubrics to assess quality in the context of systems evaluation approaches will also be discussed.
Stories and Statistics with SenseMaker: New Kid on the Evaluative Block
Irene Guijt, Learning by Design, iguijt@learningbydesign.org
Dave Snowden, Cognitive Edge, dave.snowden@cognitive-edge.com
Abstract: SenseMaker, developed by Dave Snowden, is an innovative newcomer to evaluative practice, with experiments in 2010 and 2011 pioneering its application in international development. This paper draws on examples in Kenya (community development) and Ghana/Uganda/global policy (water services) to illustrate how several persistent dilemmas in the evaluation profession can be overcome. SenseMaker helps organizations: focus on shifting impact patterns as perceived by different perspectives; generate databases, people's life libraries, that allow SenseMaker if facilitated and linked to decision makers - creating evidence-based policy; generate rolling baselines to continually update the evidence base; enable cross-silo thinking (including cross-organisational) and overcome narrow understanding of attribution of efforts; seek surprise explicitly rather than viewing people's lives through our own concepts; provide more grounded and diverse feedback to donors, thus more local autonomy; and generate actionable insights, based on very concrete needs, via peer-to-peer knowledge management.

In a 90 minute Roundtable session, the first rotation uses the first 45 minutes and the second rotation uses the last 45 minutes.
Roundtable Rotation I: Addressing Cultural Validity of Measurement and Evaluation Among Immigrant Youth for the Implementation of Program Development
Roundtable Presentation 911 to be held in Balboa A on Saturday, Nov 5, 12:35 PM to 2:05 PM
Sponsored by the Multiethnic Issues in Evaluation TIG
Nida Rinthapol, University of California, Santa Barbara, rinthapol@gmail.com
Edwin Hunt, University of California, ehunt@education.ucsb.edu
Richard Duran, University of California, duran@education.ucsb.edu
Abstract: The focus of this study is the analysis and evaluation of validity in goal orientation (GO) measurement among secondary school students from low-income immigrant families participating in a college preparation program in Santa Barbara, CA schools. The notion of culturally appropriate measurement is crucial in the context of program evaluation. The study will incorporate psychometric methods to examine the validity and reliability of the GO measure called Pattern of Adaptive Learning Survey (PALS) among immigrant youth. The verification of cultural validity in GO measurement helps us learn about how students participating in the program learn and process information and how we can further improve the program by tailoring it to the needs of students from historically underrepresented groups, and enhance their access to higher education.
Roundtable Rotation II: Evaluating the Implementation of a Culturally-based Intervention in Hawaii
Roundtable Presentation 911 to be held in Balboa A on Saturday, Nov 5, 12:35 PM to 2:05 PM
Sponsored by the Multiethnic Issues in Evaluation TIG
Sarah Yuan, University of Hawaii, sarah.yuan@hawaii.edu
Mei-Chih Lai, University of Hawaii, meichih@hawaii.edu
Karen Heusel, University of Hawaii, kheusel@hawaii.edu
Abstract: The Hawaii Department of Health supported Evidence-Based Programs to address underage drinking at the local level. This study focuses on Project Venture, a youth program demonstrated to decrease drinking among American Indian adolescents. Cultural adaptations were integrated in the program, including translating the curriculum into the Hawaiian language, to tailor it to the diverse cultures in Hawaii. This study is the first empirical evaluation of a culturally based prevention intervention to reduce underage drinking in Hawaii. The populations served, participants' characteristics, program design and adaptations, and participants' experience and satisfaction were examined through in-depth interviews, focus groups, and program surveys. Program outcomes were analyzed using data from pre- and post-surveys of participants. A repeated-measures GLM analyzed 1) program effectiveness, 2) outcome differences among settings, and 3) factors for successful program implementation. The lessons learned are provided to assist future intervention practices in multiethnic communities.

Session Title: Pitfalls in Reporting Services and Coverage to a Donor Community Hungry for Positive Results
Think Tank Session 912 to be held in Balboa C on Saturday, Nov 5, 12:35 PM to 2:05 PM
Sponsored by the Non-profit and Foundations Evaluation TIG
Dale Hill, American Red Cross, hilldal@usa.redcross.org
Scott Chaplowe, International Federation of Red Cross and Red Crescent Societies, scott.chaplowe@ifrc.org
Gregg Friedman, American Red Cross, friedmang@usa.redcross.org
Abstract: Over the last decade, the international humanitarian and development community has given increasing attention to accountability. Program managers have called on analysts to focus on design and measurement of impact indicators and higher-level outcomes, while assuming data collection on "lower-level indicators" for outputs and coverage is more straightforward. However, large umbrella organizations face special challenges in guiding both data collection and reporting for key outputs, such as counting those reached. Issues such as double counting and distinguishing between direct and indirect recipients of services arise, particularly when organizations or branches operate in multiple sectors over different time periods in overlapping locations. This think tank will present the challenges experienced by the Red Cross Movement in reporting on programming in complex emergency settings (i.e. Tsunami and Haiti response), as well as longer-term recovery and development interventions, inviting participants to share their own lessons learned with reporting and aggregation.

Session Title: Transferring Evaluation Experience Across Program Contexts: Discursive Evaluation With Two National Science Foundation ITEST Programs, Carnegie Mellon University's ACTIVATE and ALICE
Multipaper Session 913 to be held in Capistrano A on Saturday, Nov 5, 12:35 PM to 2:05 PM
Sponsored by the Organizational Learning and Evaluation Capacity Building TIG
Cynthia Tananis, University of Pittsburgh, tananis@pitt.edu
Abstract: School reform programs are often faced with an atmosphere of diversity and complexity where communication and organizational learning are increasingly difficult. As evaluators, our challenge is to create common ground from which to speak about program inputs, implementation, and goals. We have found that discursive evaluation strengthens working relationships with clients and allows for transferring and leveraging knowledge across projects. This panel focuses on discursive evaluation with two National Science Foundation ITEST funded programs of Carnegie Mellon University's computer science department. Our discursive relationship with the program teams has increased evaluation capacity across both projects, both in content expertise and effective evaluation practices in computing education. By investing in the relationships with our clients, not only are the common educational endeavors strengthened but so too are individual capacities. In this paper, we describe the strengths and challenges that accompany a discursive evaluation approach.
The Role of the Activate Workshops in Teachers' Professional Growth and Student Learning: Measuring the Effectiveness of Teachers' Professional Development in Computer Science Within a K-12 Education Context
Yuanyuan Wang, University of Pittsburgh, yuw21@pitt.edu
The ACTIVATE year 1 evaluation included summer workshop surveys, a follow-up survey, and follow-up interviews. The instruments were designed as an integrated series of evaluation activities for K-12 teacher professional development in computer science. The summer workshop surveys consisted of a baseline survey for all participants, and pre-post workshop surveys and a post-workshop skills assessment for each of the three workshops (Alice, Computational Thinking, and Java). The follow-up survey and interviews focused on teachers' implementation of workshop materials/activities and their impact on students' interest in their computer science course and future computer-related careers. Findings indicated that the goals of ACTIVATE were substantially achieved. Specifically, teachers made good use of the workshop materials/activities in their classrooms. Teachers strengthened their content knowledge and skills in computer science after their participation in the workshop(s), and this strengthened knowledge base benefited student learning and contributed to students' increased interest in computer-related careers.
Go Ask Alice: Faculty Mentoring and the Implementation of Alice 3.0 in Community College Contexts
Keith Trahan, University of Pittsburgh, kwt2@pitt.edu
Cara Ciminillo, University of Pittsburgh, ciminill@pitt.edu
Community college faculties are notoriously disconnected. Thus, reform programs designed to change the way faculty teach and students learn must find a way to gain traction. For CMU's Alice program, the solution was the combination of a faculty mentor network and student interest in graphics, animation, and storytelling. The ALICE year 1 evaluation consisted of baseline and end-of-course surveys administered to the students of participating faculty at community colleges in New Jersey, Texas, and Pennsylvania. Courses in which ALICE was implemented included introductory, general, and advanced computer programming courses. At the end of the year, interviews with participating faculty and key program personnel were conducted to collect information on both the implementation of instructional practices and the experience in the ALICE mentoring network. The focus of the ALICE evaluation was threefold: student experience in courses utilizing Alice, faculty experience in those courses, and faculty perspectives on the Alice mentoring network.
Discursive Evaluation: A Process of Capacity Building for Both Evaluators and Program Leaders
Cara Ciminillo, University of Pittsburgh, ciminill@pitt.edu
Keith Trahan, University of Pittsburgh, kwt2@pitt.edu
Having a discursive relationship with the program teams has increased evaluation capacity across both the ACTIVATE and ALICE projects, in both computer programming content expertise and effective evaluation practices in computing education. As well, transfer and application of knowledge has helped to inform future funding proposals and in fact has already helped to inform the design of one of our newest projects, Duke Scale Up. By investing in the relationships with our clients, not only are the common educational endeavors strengthened but so too are individual capacities. In this paper, we describe the strengths and challenges that accompany a discursive evaluation.

Session Title: Establishing Best Practices for Public Health Emergency Preparedness and Response Evaluation
Multipaper Session 914 to be held in Capistrano B on Saturday, Nov 5, 12:35 PM to 2:05 PM
Sponsored by the Disaster and Emergency Management Evaluation TIG
Brandi Gilbert, University of Colorado at Boulder, brandi.gilbert@colorado.edu
Defining Context-Sensitive Evaluative Criteria: Lessons From the Development of National Public Health Preparedness Standards
Christopher Nelson, RAND Corporation, ffcarp123@yahoo.com
Abstract: Recent legislation requires that federal public health preparedness funding to states is linked to clear performance measures and standards. Yet, the variation in risk profiles, community characteristics, and governance structures across the nation's 2,600 health departments means that the standards must strike a balance between the simplicity associated with national uniformity and the flexibility needed to ensure that the standards are not counterproductive in some communities. This paper describes lessons learned from a project to develop national standards (a form of evaluative criteria) on communities' ability to deliver lifesaving medications to their populations in response to a bio-terrorist attack, disease outbreak, or other disaster. The paper provides a taxonomy of models for evaluative criteria that balance standardization and flexibility and describes some of the sources of evidence used to develop the criteria, including a survey of community practices, mathematical modeling of disaster scenarios, and expert elicitation.
Measuring Community Resilience Within Public Health Emergency Preparedness and Response: Complexities and Lessons Learned
Karen Kun, Centers for Disease Control and Prevention, icn3@cdc.gov
Dale Rose, Centers for Disease Control and Prevention, ido8@cdc.gov
Thomas Morris, Centers for Disease Control and Prevention, tom8@cdc.gov
Monique Salter, Centers for Disease Control and Prevention, hjf2@cdc.gov
Tamara Lamia, ICF Macro, tlamia@icfi.com
Amee Bhalakia, ICF Macro, abhalakia@icfi.com
Anita McLees, Centers for Disease Control and Prevention, amclees@cdc.gov
Abstract: The Centers for Disease Control and Prevention (CDC) manages funds appropriated by Congress for public health preparedness and response activities. Since 2001, CDC has provided support through the Public Health Emergency Preparedness (PHEP) Cooperative Agreement to 62 state, territorial and local public health agencies to build their infrastructure and capabilities in preparedness and response. A new five-year funding cycle begins in 2011 that requires PHEP awardees to demonstrate 15 preparedness capabilities. Community resilience (i.e. community preparedness and community recovery) represents two of these capabilities. This presentation will briefly describe the methodology used for developing community resilience performance measures, and the relevance of these measures for purposes of program accountability and improvement. The presentation will also focus on the complexities inherent in measuring community resilience, particularly within the context of public health emergency preparedness and response activities, and the lessons learned during the measurement development process.
Performance Measurement of Laboratory, Public Health Surveillance and Epidemiological Investigation Capabilities Within the Context of Public Health Emergency Preparedness: A Pilot Project
Monique Salter, Centers for Disease Control and Prevention, msalter@cdc.gov
Rupesh Naik, Centers for Disease Control and Prevention, rnaik@cdc.gov
Dale Rose, Centers for Disease Control and Prevention, ido8@cdc.gov
Jacqueline Avery, Centers for Disease Control and Prevention, hjn9@cdc.gov
Erica Bushong, Centers for Disease Control and Prevention, goj8@cdc.gov
DeAndrea Martinez, Centers for Disease Control and Prevention, hez0@cdc.gov
Anita McLees, Centers for Disease Control and Prevention, amclees@cdc.gov
Thomas Morris, Centers for Disease Control and Prevention, tom8@cdc.gov
Karen Mumford, Centers for Disease Control and Prevention, kmumford@cdc.gov
Abstract: The Pandemic and All-Hazards Preparedness Act of 2006 requires the development and application of evidence-based benchmarks and objective standards that measure levels of preparedness. The National Health Security Strategy, released in December 2009, identifies priority areas for practice and measurement in public health (PH) preparedness and response, which include Biosurveillance. In order for state and local PH jurisdictions to be able to demonstrate Biosurveillance (i.e. PH surveillance, epidemiological investigation, and PH laboratory testing) capabilities, the Centers for Disease Control and Prevention (CDC), in collaboration with federal, state, and local PH partners, developed a set of formative measures that would capture performance in these areas. This presentation will briefly describe the design and methodology of a pilot study used to test this draft set of Biosurveillance performance measures. Additionally, there will be a summary of key findings and recommendations for implementing these measures across a wide assortment of PH jurisdictions.

Session Title: Identifying and Using Evidence-Based Practices
Multipaper Session 915 to be held in Carmel on Saturday, Nov 5, 12:35 PM to 2:05 PM
Sponsored by the Alcohol, Drug Abuse, and Mental Health TIG
Gintanjali Shrestha, Washington State University, gintanjali.shrestha@email.wsu.edu
Propositional If-Then Statements in the Qualitative Analysis of Implementation Fidelity
Oliver Tom Massey, University of South Florida, massey@usf.edu
Nicole Deschenes, University of South Florida, ndeschenes@usf.edu
Kathleen Ferreira, University of South Florida, ferreira@usf.edu
Abstract: In the last decade behavioral health researchers and practitioners have come to recognize the critical importance of the use of service interventions that have established evidence of their efficacy. To be effective, these evidence based practices (EBPs) must be implemented in service settings in ways that maintain the integrity of the principles, procedures, and standards under which they were developed. The assessment of implementation fidelity thus becomes a critical element in ensuring that programs will work. The current study describes the use of propositional if-then tests to determine the degree to which fidelity to the program model is expressed in qualitative interviews and focus groups. Key program informants include administrators, staff, and service consumers in the seven national Healthy Transitions Initiative programs (HTI) that serve youth and young adults as they transition from the children's mental health service system to adulthood.
Critical Review of Evidence-Based Program Registers for Behavioral Health Treatment
Stephen Magura, Western Michigan University, stephen.magura@wmich.edu
Daniela Schroeter, Western Michigan University, daniela.schroeter@wmich.edu
Chris Coryn, Western Michigan University, chris.coryn@wmich.edu
Abstract: The continuing search for 'what works' is very important in addiction and mental health treatment. Evidence-based program registers (EBPRs) are being created to aid in decisions about treatment adoption. There are more than 40 such web-based registers, which promote interventions as 'evidence-based,' 'effective,' 'best practices,' or 'promising practices.' Familiar examples are the National Registry of Evidence-Based Programs and Practices, the CDC's Diffusion of Effective Behavioral Interventions website and the Cochrane Reviews. Unfortunately, the proliferation of EBPRs has been accompanied by often dramatically different purposes, criteria for inclusion, definitions of acceptable 'evidence,' and standards for designating interventions as 'effective.' The presentation will describe the plan for an upcoming critical review of EBPRs expected to be funded by the National Institute on Drug Abuse. This will investigate the similarities and differences among EBPRs and determine the practical consequences of using different types and standards of evidence for certifying given treatments as 'effective.'
Using a Deliberative Democratic Evaluation Tool to Evaluate, Educate, and Prevent Border Violence: Contextually Rich Service to Science Study
Terrence Tutchings, O Z White Associates LLC, ttutchings@gmail.com
Sandra Eames, Austin Community College, seames@austin.rr.com
Martin Arocena, O Z White Associates LLC, martin.arocena@gmail.com
Abstract: Narco-terrorism, Hurricanes Ike, Dolly and Alex, and relentless media attention to immigration issues form the backdrop for a process evaluation and feasibility assessment of the Border Violence Prevention Task Force community mobilization efforts in 2010. With funding provided by the Texas Department of State Health Services the Task Force developed seven focused curriculum modules based on the needs of community members, promotoras, and other service providers to identify and implement evidence-based responses to violence and natural disasters in the Mexico-U.S. border region. This paper presents the need for the curriculum modules, the development of evidence-based content from the fields of criminology, education, substance abuse, mental health and disaster preparation, results from participant feedback testing feasibility of the approach, and where the project now stands as a Service-to-Science candidate. Stufflebeam's CIPP Model (2000) and House and Howe's (2000) Deliberative Democratic Evaluation Checklist provided the framework for the evaluation and feasibility assessment.

Session Title: Evaluating Emerging Educational Technologies in K-12 and Higher Education
Multipaper Session 916 to be held in Coronado on Saturday, Nov 5, 12:35 PM to 2:05 PM
Sponsored by the Distance Ed. & Other Educational Technologies TIG
S Marshall Perry, Dowling College, perrysm@dowling.edu
Abstract: This multipaper session examines the relative contributions that emerging educational technologies such as online instruction and web-based assessment tools might have upon student academic growth, teacher practice, and assessment. Researchers will describe methodological challenges, promising research designs, the creation of quantifiable indicators, and the transition process towards effectively leveraging emerging technologies. Paper presentations will also discuss findings from a current evaluation of a K-12 online instructional program and another study of a college's transition to online assessment and data management. We believe that this research should be useful to educational service providers, K-12 school systems, higher education officers, and policymakers. While the findings of the studies themselves are of interest, researchers hope that the session encourages the broad participation of attendees in a discussion of methodological and ethical implications of evaluating educational technologies generally.
Evaluating Effective Technology Enhanced Instruction Using the International Society of Technology Education: National Educational Technology Standards
Maria Esposito, Molloy College, mesposito@molloy.edu
Maria Esposito, M.A. will discuss a framework for teacher evaluation using the International Society of Technology Education - National Educational Technology Standards. She will describe how school systems can potentially operationalize the areas of creativity and innovation; communication and collaboration; research and information fluency; critical thinking, problem solving, and decision making; digital citizenship; and technology operations and concepts. She will also discuss essential conditions to support skilled pedagogy that effectively leverages existing technology. Maria is currently teaching Instructional Technology at Molloy College in Long Island, New York. Maria has also served as a K-12 technology administrator, K-12 educator, and was a technology administrator at Cravath, Swaine, & Moore, LLP, one of the world's largest corporate law firms. Maria received her Master's Degree in Educational Communication and Technology at New York University and is currently in the dissertation phase of her Doctorate in Educational Administration at Dowling College.
Adopting Pass-Port: A Systems Approach to Changing Culture and Practices
Richard Bernato, Dowling College, bernator@dowling.edu
Richard Bernato, Ed.D. will describe the process the Dowling College School of Education undertook in 2009 to maintain its NCATE recognition status. The school recognized the need to formalize its use of a variety of data sources to diagnose and prescribe for program improvement. The process Dowling College took to choose the web-based data gathering system, Pass-Port, and to weave it among the professional and leadership practices throughout the school has many systems-based and high involvement factors worthy of consideration. Dr. Bernato is the Assistant Dean for the School of Education. He is serving the school as its NCATE Coordinator and is a member of the Department of Educational Leadership, Administration, and Technology. He also consults with school districts to facilitate strategic planning, shared decision making, and school improvement reform efforts. Previously, he has served in many leadership roles in public education, such as Assistant Superintendent for Educational Services.
Supplemental Learning Online for Middle School Students: An Evaluation and Discussion
S Marshall Perry, Dowling College, perrysm@dowling.edu
S. Marshall Perry, Ph.D. will discuss a federally-funded evaluation of an online individualized tutoring service. The evaluation focused on supplemental instruction in reading and mathematics at the middle school level. Over two years, more than 700 middle school students were offered 25 hours of programming, including approximately 22 hours of instruction and three hours of assessments. The evaluation included a randomized control trial to determine the relationship between academic achievement and involvement in the program. Dr. Perry is an Assistant Professor at the Dowling College School of Education. He holds a Ph.D. in Administration and Policy Analysis from the Stanford University School of Education. He also holds a B.A. with distinction in Political Science from Yale University. Previously, Dr. Perry was a Senior Research Associate at Rockman et al, a research, evaluation, and consulting firm. Currently, he consults with public schools involved in restructuring to improve student achievement.
Online Versus Face-to-Face Learning in the College Classroom: A Discussion
Janet Caruso, Nassau Community College, jxc133@dowling.edu
Janet Caruso, M.B.A. will discuss the existing literature surrounding online learning in college courses and discuss the methodological and logistical challenges of conducting rigorous evaluations. For example, when comparing face-to-face to online classes, evaluators might have difficulty distinguishing programmatic effects from teacher effects, student characteristics, teacher orientation, testing effects, or subject matter applicability. Janet is currently the Dean of Business and Professional Education and an adjunct assistant professor at Nassau Community College. She is a member of the academic administrative team that focuses on the development of new programs, policies, and procedures which will address the future needs of the College. She has been involved in higher education for over 25 years and has served in various academic and administrative positions at other institutions, including Chair, Business Administration Department; Director, Office of Adult Learning; and Dean of Faculty. Janet is currently pursuing her doctoral degree at Dowling College.

Session Title: HIV, Tuberculosis and Pregnancy Prevention: Methods and Strategies for Evaluation
Multipaper Session 917 to be held in El Capitan A on Saturday, Nov 5, 12:35 PM to 2:05 PM
Sponsored by the Health Evaluation TIG
Herb Baum, REDA International Inc, drherb@jhu.edu
Usability Testing of an Evidence-Based Teen Pregnancy/STI Prevention Program for American Indian/Alaska Native Youth: A Multi-Site Assessment
Ebun Odeneye, University of Texas, Houston, ebun.o.odeneye@uth.tmc.edu
Ross Shegog, University of Texas, Houston, ross.shegog@uth.tmc.edu
Christine Markham, University of Texas, Houston, christine.markham@uth.tmc.edu
Melissa Peskin, University of Texas, Houston, melissa.f.peskin@uth.tmc.edu
Stephanie Craig-Rushing, Northwest Portland Area Indian Health Board, scraig@npaihb.org
David Stephens, Northwest Portland Area Indian Health Board, dstephens@npaihb.org
Abstract: After a 15-year decline, the national teen birth rate in this country increased between 2005 and 2007; American Indian/Alaska Native (AI/AN) youth had the greatest increase (12%) compared to other ethnic groups. AI/AN youth also experience significant STI/HIV disparities compared to other US teens. As part of the formative evaluation phase of a larger study, 'Innovative Approaches to Preventing Teen Pregnancy among American Indian and Alaska Native Youth', we will conduct intensive usability testing with AI/AN youth in three regions (n=90) to determine parameters for adaptation of an evidence-based program, 'It's Your Game, Keep It Real' (IYG), for this population. Data will be gathered on satisfaction with the user interface, ease of use, acceptability, credibility, motivational appeal, and applicability of IYG. Youth will evaluate each activity on the usability parameters, noting content, design, and thematic problems and ideas on improvement. Lessons learned from planning and implementing usability testing will be discussed.
A Comparison of Methods for Evaluating Implementation Fidelity of a Pregnancy and HIV Prevention Program
Pamela Drake, ETR Associates Inc, pamd@etr.org
Jill Glassman, ETR Associates Inc, jillg@etr.org
Lisa Unti, ETR Associates Inc, lisau@etr.org
Abstract: We used an RCT design to evaluate an online training program for over 200 educators implementing the Reducing the Risk program. Our primary outcome of interest was implementation fidelity. We used several methods for measuring implementation fidelity: educator implementation logs for each lesson, interviews with educators after implementing specific lessons, in-person observations, audio observations, follow-up interviews, and post surveys. We are analyzing and comparing all data sources to determine how valid the educator self-report logs are, which method(s) appears to provide the most valid data source, how validity appears to vary across the data sources, and what information each data source adds to the measurement of implementation fidelity. The results will provide important information for improving the quality of the various types of measures of implementation fidelity and recommendations for which sources yield the best quality data given various resource constraints.
Strategies for Evaluating Tuberculosis Control and Prevention Programs
Lakshmy Menon, Centers for Disease Control and Prevention, lmenon@cdc.gov
Awal Khan, Centers for Disease Control and Prevention, aek5@cdc.gov
Abstract: This paper will present various monitoring and evaluation strategies (of specific interventions and routine program activities) used by tuberculosis (TB) control and prevention programs in both national and international settings (low-burden and high-burden settings, non-governmental as well as governmental organizations). Case studies will be used to illustrate types of evaluations (including cost-benefit analysis, feasibility and impact studies, and quality improvement); challenges and successes encountered during the implementation of each evaluation will also be presented. Time and resources are often limited when conducting an evaluation of an existing health program. This paper will highlight approaches that work so overburdened staff or evaluators can undertake a successful evaluation of their program.
From Interviews to Implementation: Conducting Formative Evaluation to Implement an Access to Care Intervention
Sarah Chrestman, Louisiana Public Health Institute, schrestman@lphi.org
Michael Robinson, Louisiana Public Health Institute, mrobinson@lphi.org
Jack Carrel, Louisiana Office of Public Health, jack.carrel@la.gov
Susan Bergson, Louisiana Public Health Institute, sbergson@lphi.org
Snigdha Mukherjee, Louisiana Public Health Institute, smukherjee@lphi.org
Abstract: Louisiana Positive Charge is part of a multi-site national evaluation assessing linkage to medical care for HIV+ individuals. Formative evaluation was conducted to understand why some people get into care and remain in care at the time of diagnosis and others do not. Key informant interviews were conducted with persons diagnosed with HIV at STD clinics to assess their diagnosis experience and linkage to care. The results were used to develop and implement the evaluation plan. Eleven of 12 respondents were satisfied with their testing experience, and all were linked to medical care. The majority reported their desire to live/be healthy as the reason they got into care. Suggested reasons for people not entering medical care were fear/denial, stigma and transportation. Topics to be discussed include results of formative evaluation, values of stakeholders, how the evaluation drives data collection/analysis, and how the values compel changes in methodology.

Session Title: Valuing Innovation in Democracy Assistance Evaluation
Panel Session 918 to be held in El Capitan B on Saturday, Nov 5, 12:35 PM to 2:05 PM
Sponsored by the International and Cross-cultural Evaluation TIG
Georges Fauriol, National Endowment for Democracy, georgesf@ned.org
Abstract: Democracy assistance presents a particular set of challenges to the field of evaluation. By their very nature, these types of projects and programs are extremely difficult to evaluate. Traditional methods are not always feasible given the conditions under which democracy assistance projects and programs take place. This has led democracy assistance organizations to explore innovative methods to monitor and evaluate their work. This panel will explore the evaluation innovations of a grantmaking organization and its four core institutes working in the field of democracy assistance around the world.
Cumulative Assessments: Innovative Evaluation of Long-term Projects and Grantees
Rebekah Usatin, National Endowment for Democracy, rebekahu@ned.org
The National Endowment for Democracy (NED) is a private, nonprofit organization created in 1983 and funded through an annual congressional appropriation to strengthen democratic institutions around the world through nongovernmental efforts. NED's grants program provides support to grassroots organizations in more than 80 countries to conduct projects of their own design. The varied political and cultural contexts of NED grantees coupled with the difficulties of attributing programmatic success to a single small grant have led the Endowment to look for innovative methods for measuring short- and long-term success of its grantees. This presentation will discuss the conception and implementation of a pioneering grantee self-assessment process that was launched in 2010.
Innovating Intuition: Documenting Program Development and Adaptation in the Evaluative Process
Liz Ruedy, International Republican Institute, eruedy@iri.org
With more than 25 years of experience in the democracy and governance sector, the International Republican Institute has accumulated a wealth of institutional knowledge about how to design and implement effective programs. Efforts to monitor and evaluate these programs, however, depend on developing an in-depth understanding of what is often an intuitive process: identifying a specific set of needs, determining how and why proposed interventions will address those needs, and recognizing when and how a program must adapt as needs or circumstances change. In addition to developing "nuts and bolts" monitoring and evaluation tools, such as quantitative indicators, IRI is therefore applying innovative tools such as program theory framework templates, process journals, and outcome and system mapping to capture the logic and decisions that constitute the foundation of successful programs. This presentation will discuss the utility of these tools, and their applicability to larger monitoring and evaluation processes.
Evidence-based Advocacy: An Innovative Approach to Evaluating Policy Advocacy
Joel Scanlon, Center for International Private Enterprise, jscanlon@cipe.org
The Center for International Private Enterprise works in partnership with local private sector organizations to strengthen democracy through market-oriented reform. CIPE's programs aim, in the short run, to build local private sector capacity and, in the longer term, to achieve institutional reforms through policy advocacy. As a democracy assistance organization, CIPE is interested in evaluating the policy outcomes of advocacy campaigns, but also, and equally important, the quality of the partners' engagement in the political process. Policy advocacy is a relatively new focus area for evaluation, and advocacy evaluations primarily have been conducted for campaigns in the U.S. political environment. This presentation will discuss CIPE's on-going development and implementation of tools, for both CIPE and partner organizations, for monitoring and evaluating advocacy in our international programs.
Using Innovative Technologies to Bridge the DC-Field Divide in Evaluation Capacity Building
Linda Stern, National Democratic Institute, lstern@ndi.org
How does a democracy assistance organization build M&E capacity across its global programming? In 2008, the National Democratic Institute for International Affairs (NDI) set about answering this question, starting with a thorough self-assessment of its capacity to monitor, evaluate and learn from its programming. The assessment revealed a number of strengths as well as weaknesses, not least of which was a critical gap in the capacity of DC- and field-based staff to fully integrate evaluative practice into their project cycles. With support from the National Endowment for Democracy, NDI has developed an online learning portal designed to extend M&E capacity building processes, tools and resources to its staff throughout the NDI world. This presentation will briefly highlight organizational assessment methodologies and findings, and then share NDI's experience in using innovative technologies to bridge the DC-field divide.
Congruence of Principles and Practice: Innovative Evaluation of Workers Rights Programs
Dona Dobosz, Solidarity Center, ddobosz@solidaritycenter.org
How does a democracy assistance organization reconcile its mission, principles and priorities with the purpose and objectives of its donor organizations? The Solidarity Center has sought to reconcile its programmatic priorities - building the capacity of trade unions and their worker rights' allies around the world to advance labor rights and standards and improve living and working conditions; and challenging and reforming laws and practices that repress worker and human rights - with the objectives of its funders by developing evaluation that addresses both. This exercise carves out shared space for donor missions, principles and practices that are congruent with the Solidarity Center's own. This presentation will highlight how the SC assesses its performance by identifying and structuring internal evaluation points and selecting from a cross-section of external donor-driven factors to structure evaluation that reflects and responds to both the internal truths of the organization and the external realities of the donor.

Session Title: Using Research Electronic Data Capture (REDCap) for Designing a Data Collection System in the Field
Demonstration Session 920 to be held in Huntington A on Saturday, Nov 5, 12:35 PM to 2:05 PM
Sponsored by the Integrating Technology Into Evaluation TIG
Teri Garstka, University of Kansas, garstka@ku.edu
Jared Barton, University of Kansas, jaredlee@ku.edu
Karin Chang-Rios, University of Kansas, kcr@ku.edu
Abstract: Collecting real-time data securely on a mobile device can pose many challenges, particularly when that data requires HIPAA compliance safeguards. Research Electronic Data Capture (REDCap) is a secure, web-based application designed exclusively to support data capture for research and evaluation. This demonstration will show how to use REDCap to design field assessments and collect data on an iPad in naturalistic settings such as in-home or in the community. This is a particularly useful way to establish rapport with participants and to ensure safe, secure data collection of sensitive information. This session will provide an overview of REDCap features and its interface with mobile devices such as iPads or netbooks. REDCap also makes it easy to import case-level data from a service agency's Management Information System (MIS) securely. We will demonstrate how this system works for our evaluation of a home visiting program for pregnant teens and their parents.

Session Title: Elephants in the Evaluation Room: Managing Evaluations Amid Clashing Values of Program Staff and Professional Evaluators
Think Tank Session 921 to be held in Huntington B on Saturday, Nov 5, 12:35 PM to 2:05 PM
Sponsored by the Evaluation Managers and Supervisors TIG
Michelle Mandolia, United States Environmental Protection Agency, mandolia.michelle@epa.gov
Yvonne Watson, United States Environmental Protection Agency, watson.yvonne@epa.gov
Michelle Mandolia, United States Environmental Protection Agency, mandolia.michelle@epa.gov
Matt Keene, United States Environmental Protection Agency, keene.matt@epa.gov
Abstract: Evaluation results have the greatest potential to effect program or policy change when certain criteria have been met: 1) objectivity (quality control against subjective bias); 2) active stakeholder involvement in the evaluation; 3) attention to rigorous design issues appropriate for evaluation context; and 4) granting the evaluator license to provide nuanced interpretation of complex results. Attention to these non-exhaustive characteristics of good evaluation illustrates that sometimes these values conflict, particularly when program staff members become heavily invested in shaping the final evaluation product. Those with evaluation management oversight must manage this conflict. Evaluation managers will discuss composite case examples that illustrate the situation whereby program staff's vested interests in the evaluation interfere with the objectivity and rigor of the final evaluation. The presenters will solicit feedback and concrete strategies for negotiating this dilemma with breakout groups focusing on managing evaluation process, evaluation design development, evaluation personnel, and evaluation results.

Session Title: Evaluating the Wicked: Implications of the "Wicked Problem" Concept for Program Evaluation and Organizational Leadership
Panel Session 922 to be held in Huntington C on Saturday, Nov 5, 12:35 PM to 2:05 PM
Sponsored by the Non-profit and Foundations Evaluation TIG
Gayle Peterson, Headwaters Group Philanthropic Services, gpeterson@headwatersgroup.com
Abstract: Many of the issues addressed by philanthropies and public agencies that hire evaluators - issues like poverty, global epidemics, food security, and climate change - are classic "wicked problems": they have multiple root causes and defy clear definition; attempted solutions are a matter of judgment and are likely to generate new, unpredictable problems; and they involve many diverse stakeholders, all of whom have different ideas about the problem and its solutions. Since 1973, when Rittel and Webber introduced the idea, the concept of wicked problems has stimulated a large body of research and practice in fields such as management, planning and organizational behavior. The presenters suggest that the concept has important implications for the values of the agencies attempting to address them, as well as for the design and conduct of program evaluations.
The Culture of Leadership in the Context of Wicked Problems
Gayle Peterson, Headwaters Group Philanthropic Services, gpeterson@headwatersgroup.com
The values underlying an agency's culture and the way it undertakes its business reflect its leadership. Gayle Peterson will provide an overview of leadership issues which an evaluation team needs to consider when designing, implementing, and reporting on evaluations in the context of wicked problems. Co-founder of the Headwaters Group Philanthropic Services, Peterson brings more than two decades of experience in philanthropic leadership and multi-sector collaboration. She is currently an Associate Fellow at Oxford University's Said Business School, where she is exploring innovative approaches to global philanthropy. Peterson and Sherman are writing a book on wicked problems in philanthropy, "Good, Evil, Wicked: The Art, Science, and Business of Giving," which draws on interviews with 1000 global givers. Peterson's presentation will set the stage for the next two presenters.
Evaluation of Leadership and Organizational Values and Culture
John Sherman, Headwaters Group Philanthropic Services, jsherman@headwatersgroup.com
Evaluating wicked problems starts with evaluating the leadership and organizational cultures of the agencies attempting to solve them. John Sherman will discuss evaluation approaches and experiences he and his firm (Headwaters Group) have used in evaluating leadership and organizational cultures, and the lessons learned from those efforts. With experience in managing nonprofits and foundations as well as over fifteen years in evaluating them, Sherman brings first-hand leadership and on-the-ground evaluation experiences along with an empathetic understanding of how to communicate evaluation approaches and lessons to leaders.
Evaluation and Double-Loop Learning
Edward Wilson, Headwaters Group Philanthropic Services, ewilson@headwatersgroup.com
With 25 years of experience in evaluation research and a background in Social Systems Sciences, Ed Wilson (currently a Senior Evaluation Fellow at Headwaters) makes the case for an approach to evaluation that encourages critical reflection on problem frames - the assumptions and values that underlie the formulation of a wicked problem and the choice of solutions. Evaluators typically set out to assess the effectiveness of programs without systematically questioning underlying assumptions and values. Such "single-loop learning" evaluations play a useful role, but there is an equally important role for evaluations aimed at "double-loop learning," which engage stakeholders in processes to reveal, question and rethink underlying problem frames. In fact, many evaluations serve to facilitate double-loop learning even though they are not explicitly designed to do so. Wilson suggests that a more intentional approach to double-loop learning can help organizational leaders and evaluators more effectively cope with "wicked" complexity.

Session Title: Evaluating Technical Assistance Providers: Beyond Effectiveness to Measuring Impact
Panel Session 923 to be held in La Jolla on Saturday, Nov 5, 12:35 PM to 2:05 PM
Sponsored by the Cluster, Multi-site and Multi-level Evaluation TIG
John Bosma, WestEd, jbosma@wested.org
Abstract: WestEd is conducting an evaluation of three comprehensive centers, one that serves a single state, one that serves a region, and one that provides content expertise to the other centers. Although unique in how each center operates, WestEd identified an evaluation framework that cuts across all three technical assistance centers. The panel will present a three-tier framework for evaluation focused on 1) the assistance the centers provided, 2) how well the centers provided assistance, and 3) the overall impact of the assistance the centers provided. This framework acknowledges the unique features of each center and examines the effectiveness of each center's technical assistance and respective impact.
Technical Assistance Centers - Making Sense of How They Work
Marycruz Diaz, WestEd, mdiaz@wested.org
Valentin Pedroza, WestEd, vpedroz@wested.org
Technical assistance centers provide myriad services across many projects and goals. Their services may entail presenting research-based practices at a one-day meeting or working directly with clients over several years to develop structures and processes aimed at reaching long-term goals. This presentation will discuss methods for keeping abreast of assistance center activities and working with project staff to ensure activities are captured as part of the evaluation. The discussion will be framed around working with and capturing the unique characteristics of the three centers WestEd evaluates.
Evaluating the Effective Delivery of Technical Assistance
Sharon Herpin, WestEd, sherpin@wested.org
Adrienne Washington, WestEd, awashin@wested.org
Determining how well clients provide their services is a core component of any evaluation. These data can feed formative findings, assess fidelity of implementation, and provide key contextual information necessary for understanding the outcomes of technical assistance. This presentation will explain how research about best practices in providing technical assistance and professional development was compiled into a set of principles and criteria for effectively measuring how well providers perform their jobs. This model goes beyond the simple measures of quality, relevance, and usefulness to capture a more robust view of the services provided and the effectiveness of methods used to provide assistance.
Beyond Effective Delivery - Measuring Impact
Juan Carlos Bojorquez, WestEd, jbojorq@wested.org
Valentin Pedroza, WestEd, vpedroz@wested.org
Assessing how well services are provided is an important component of evaluation, but only tells one part of the story. The required quality, relevance, and usefulness indicators miss the bigger picture of what the technical assistance centers accomplished and their effect on recipients. This third tier of the evaluation framework presents a model for documenting accomplishments and measuring change at various levels, such as with individuals, teams, departments, and organizations. The model also looks at different types of capacities and methods for measuring the impact of technical assistance on those capacities.

Session Title: The Science of Team Science: Advances in the Evaluation of Interdisciplinary Team Science From the National Institutes of Health (NIH) National Evaluation of the Interdisciplinary Research Consortia
Panel Session 924 to be held in Laguna A on Saturday, Nov 5, 12:35 PM to 2:05 PM
Sponsored by the Research on Evaluation TIG
Jacob Tebes, Yale University School of Medicine, jacob.tebes@yale.edu
Abstract: Interdisciplinary team science addresses complex public health challenges that cannot be addressed by a single discipline. Its emergence has fostered a new type of evaluation research: the science of team science. This emerging field focuses on understanding the structures, processes, and outcomes of team-based research, its benefits and limitations relative to single-discipline inquiry, and its capacity for achieving accelerated scientific innovation. In recent years, the National Institutes of Health, through its Roadmap for Medical Research, funded nine programs in its Interdisciplinary Research Consortium (IRC) program to conduct interdisciplinary team science. Each IRC included both local and national evaluation components, thus offering a unique opportunity for advancing the science of team science. This panel session describes methods and results of the national evaluation and selected local evaluations, presents essential tools and resources for use by evaluators, and discusses current trends and future directions in the science of team science.
Advancing the Science of Team Science Through the NIH National Evaluation of Interdisciplinary Research Consortia
Sue Hamann, National Institutes of Health, sue.hamann@nih.gov
The NIH Roadmap for Medical Research funded the Interdisciplinary Research Consortia (IRC) Program in order to transform biomedical research. IRCs sought to utilize interdisciplinary team science to study multiple health and disease conditions that could not be addressed by a single discipline. The IRC Program within NIH enlisted multiple institutes to fund research and developed a unique funding mechanism that linked individual R01 projects across institutes to one another for each of the nine consortia that were funded. The NIH also commissioned a national evaluation of the IRC program to identify its effectiveness and to make recommendations for policy regarding continued funding of interdisciplinary team science. This presentation, conducted by the director of the NIH national evaluation, provides an overview of the national evaluation, summarizes its major findings, and discusses its implications for subsequent evaluations in the science of team science.
Implications of the National Evaluation of Interdisciplinary Research Consortia for Infrastructure Development and Training in Interdisciplinary Team Science
Alina Martinez, Abt Associates Inc, alina_martinez@abtassoc.com
Because interdisciplinary team science is a relatively new area of inquiry, structures to enhance its sustainability remain embryonic. One emphasis in the national evaluation of Interdisciplinary Research Consortia was identifying the essential infrastructure for interdisciplinary team science, including interdisciplinary training. NIH contracted with Abt Associates to conduct the national evaluation and to focus significant evaluation resources to examine infrastructure development and training. This presentation, conducted by the lead evaluator of the national evaluation, summarizes a mixed methods evaluation design to examine various types of essential infrastructure to support interdisciplinary team science, describes the key process and outcome results of the evaluation, and discusses the implications of the findings for future evaluations of the science of team science as well as for the development of interdisciplinary team science.
An Illustration of Social Network Analysis in the Evaluation of Interdisciplinary Team Science
Irina Agoulnik, Brigham and Women's Hospital, irina@syscode.med.harvard.edu
Social network analysis assesses the extent and strength of interactions among individuals or groups within a system. This type of analysis has become an important tool in understanding the work among scientists engaged in interdisciplinary team science. This presentation, conducted by one of the nine Consortia funded by the NIH Interdisciplinary Research Consortia Program, summarizes its own independent, local evaluation that emphasized the use of social network analysis. As part of the local evaluation, social network analysis was used to identify areas for strengthening consortium interactions and promote interdisciplinary collaboration. This presentation, conducted by the lead evaluator of one of the IRC local evaluations and a member of the national evaluation work group, describes the results of the social network analysis, its use as a tool in evaluating interdisciplinary team science, and its utility more generally in the science of team science.
Advancing the Science of Team Science Through Comprehensive Mixed Methods Approaches
Jacob Tebes, Yale University School of Medicine, jacob.tebes@yale.edu
Evaluation in the science of team science is complex because it involves the assessment of processes, outcomes, and impacts of multiple teams that are usually working in various remote sites. In addition, since team members have different backgrounds, their views of the nature of the scientific challenges facing the team are often based on different assumptions and perspectives. For the evaluator interested in understanding this complex terrain, the use of mixed methods approaches is essential. Mixed methods allow for the systematic integration of qualitative perspectival data with quantitative productivity data to assess the quality and extent of innovation in team science. This presentation, by the lead evaluator of one of the local IRC evaluations and a member of the national evaluation work group, describes a comprehensive mixed methods evaluation of interdisciplinary team science, summarizes local evaluation findings, and illustrates key approaches for use in the science of team science.

Session Title: Utilizing Values and Context to Build Global Program Evaluation Competency: CDC's Field Epidemiology and Training Program (FETP) for Non-Communicable Diseases and Injuries
Think Tank Session 925 to be held in Laguna B on Saturday, Nov 5, 12:35 PM to 2:05 PM
Sponsored by the International and Cross-cultural Evaluation TIG
Sue Lin Yee, Centers for Disease Control and Prevention, sby9@cdc.gov
Andrea Bader, Centers for Disease Control and Prevention, vbu6@cdc.gov
Abstract: CDC's Center for Global Health is developing training curricula on non-communicable diseases (NCDs) for Field Epidemiology Training Program (FETP) fellows in five countries. Participants may enroll in basic NCD courses such as epidemiology, surveillance, prevention and control, and advanced topics that include evaluation of interventions and surveillance systems. Second-year fellows may work with mentors to apply their evaluation knowledge and skills in the field by evaluating an intervention or program using field product guidelines. The National Center for Injury Prevention and Control and the Center for Global Health seek to examine two questions: 1) How can program evaluation modules effectively integrate the values and country context of fellows to enhance learning? and 2) How can field product guidelines better promote experiential and continued learning? Participants will be invited to provide feedback on the recently piloted evaluation module and offer insights on adjustments to the approach or materials that could further enhance learning and application.

In a 90-minute Roundtable session, the first rotation uses the first 45 minutes and the second rotation uses the last 45 minutes.
Roundtable Rotation I: Collaborative Evaluation to Enhance Undergraduate Coursework and Prepare Future Teachers
Roundtable Presentation 926 to be held in Lido A on Saturday, Nov 5, 12:35 PM to 2:05 PM
Sponsored by the Assessment in Higher Education TIG
Leigh D'Amico, University of South Carolina, damico@mailbox.sc.edu
Vasanthi Rao, University of South Carolina, vasanthiji@yahoo.com
Abstract: The University of South Carolina and Midlands Technical College are collaborating to improve undergraduate coursework to better prepare pre-service teachers to effectively educate students with differing needs. Faculty members at both institutions have been evaluating syllabi of core early childhood education courses to examine content and share strategies and resources to improve course delivery and student assessment. Upon the conclusion of the syllabi evaluation and redesign, faculty will evaluate implementation of the enhanced syllabi to refine and further improve pre-service teacher preparation programs. Goals of the project include facilitating communication and collaboration between 2-year and 4-year higher education institutions; understanding the needs, values, and realities of undergraduate students; providing coursework and preparation that promote young children's growth and development; and preparing pre-service teachers to work effectively in classrooms with diverse students with differing needs.
Roundtable Rotation II: Finding Chemistry With Science Faculty: Engaging Stakeholders in Evaluation of Student Learning
Roundtable Presentation 926 to be held in Lido A on Saturday, Nov 5, 12:35 PM to 2:05 PM
Sponsored by the Assessment in Higher Education TIG
Jennifer Lewis, University of South Florida, jennifer@usf.edu
Abstract: Successful learning in a college-level biochemistry course depends on correct understanding of a number of basic concepts from general chemistry and biology, but there are few existing high-quality measures. To further complicate the situation, college science faculty, who are key stakeholders, often do not value assessment. This talk discusses the collaborative process of instrument design and development undertaken as part of the evaluation of a curriculum reform project in biochemistry involving biology, chemistry, and biochemistry faculty from multiple institutions. The most current results across the project and the implications of this work will be discussed, including the importance of the collaborative development process as professional development for college science faculty and the value of using a pre/post diagnostic instrument to maintain awareness of the need for the project's work to continue.

Session Title: Evaluation Within Complex Health Systems
Multipaper Session 927 to be held in Lido C on Saturday, Nov 5, 12:35 PM to 2:05 PM
Sponsored by the Health Evaluation TIG
Kim van der Woerd, Reciprocal Consulting, kvanderwoerd@gmail.com
Factors Associated with First Trimester Prenatal Care: An Evaluation of Presumptive Eligibility in Five Colorado Local Health Agencies
Mario Rivera, Colorado Department of Public Health and Environment, mario.rivera@state.co.us
Abstract: Presumptive eligibility (PE) is federal legislation that allows eligible pregnant women to receive 45 days of temporary medical coverage through Medicaid and CHP+ while eligibility for full health care benefits is determined. During 2009-2010, an evaluation of PE at the five local health agencies was conducted. The evaluation determined the extent to which women who receive PE services obtain prenatal care during their first trimester of pregnancy and what characteristics are associated with receipt of first trimester care. A 20-item telephone survey was the data source. The survey examined the utility of enrolling individual clients into PE services and provided an opportunity for local public health agencies to learn how evaluation can help shape local public health programming. Conclusions drawn from the evaluation suggest that PE services in conjunction with a broader system of services may work more effectively than a focus on one-on-one application services for pregnant women.
Triangulating a Three-Legged Stool: Self- and Other-Assessments of the Three Components of the North Carolina Public Health Preparedness System
Doug Easterling, Wake Forest University, dveaster@wfubmc.edu
Lucinda Brogden, Core Path Solutions LLC, lubrogden@earthlink.net
Abstract: Preparedness for and response to disasters has become a core responsibility of public health agencies. In North Carolina, public health preparedness is coordinated and carried out by three levels of actors: the NC Division of Public Health, 85 local health departments, and a set of intermediate groups (Public Health Response Teams) that provide technical assistance to local health departments. A system-wide assessment was commissioned in 2009 to inform the refinement and possible restructuring of the system. That assessment asked stakeholders at each level of the system to report on the strengths and limitations of their own agency and of the other levels of the system. This paper will present results that focus specifically on the congruence between self-assessments and external assessments. In some areas, the reports were largely in agreement (e.g., the relative strength of different forms of organizational capacity within local health departments). However, there were also areas where outside stakeholders identified major weaknesses not reported by those within the respective agency.
Windows of Opportunity for Improving Public Health Insurance Coverage In Mexico: Main Findings of a Qualitative Evaluation of the Implementation of Seguro Popular
Adolfo Martinez-Valle, Coordinacion Nacional Programa Oportunidades, adolfomartinezvalle@gmail.com
Abstract: Objective. Identify windows of opportunity by assessing the implementation of Seguro Popular de Salud (SPS) to improve its effectiveness at both the state and federal levels, as well as the challenges facing its consolidation in the year 2012, when universal coverage should be achieved. Methods. A qualitative evaluation was performed between March and December of 2006 to assess the implementation of SPS in 13 states where approximately 60 percent of the national coverage was achieved in Mexico. Results. Ensuring financial resources for prevention is a key window of opportunity to enhance the financial feasibility of SPS. More effective and timely financial flows will improve health care delivery by increasing the supply of drugs and other health care inputs. Certifying health care units financed by SPS is crucial to ensure a minimum quality of care. Conclusions. The evaluation strategy made it possible to identify key windows of opportunity to improve the implementation of SPS.

Session Title: Measuring Research Interdisciplinarity and Knowledge Diffusion
Multipaper Session 928 to be held in Malibu on Saturday, Nov 5, 12:35 PM to 2:05 PM
Sponsored by the Research, Technology, and Development Evaluation TIG
Alan Porter, Georgia Tech and Search Technology Inc, alan.porter@isye.gatech.edu
Jan Youtie, Georgia Tech, jan.youtie@innovate.gatech.edu
Abstract: Interest in the attributes of cross-disciplinary research and in the distribution of research knowledge is strong. This has inspired introduction of several new measures of interdisciplinarity and research diffusion. This session brings together several explorations of the Integration and Diffusion scores, along with other measures and visualizations, to help understand their behavior. The first paper introduces the Diffusion score and examines its behavior in a substantial benchmarking exercise, augmented by in-depth lab studies. The second paper investigates development of a companion Integration score to gauge the diversity of patent sets. The third paper applies Integration and other scoring in research program assessment. It compares researcher and proposal level variations of the metric. We then discuss the opportunities and limitations in applying these measures on behalf of research evaluation -- e.g., sensitivities to disciplinary citation norms, Web of Science coverage, temporal distributions, and so forth.
A New Measure of Knowledge Diffusion
Stephen Carley, Georgia Tech, stephen.carley@gmail.com
Alan Porter, Georgia Tech and Search Technology Inc, alan.porter@isye.gatech.edu
The Diffusion score is a new interdisciplinary metric to assess the degree to which research is cited across disciplines. It is the analog to the Integration score, which measures diversity among a given publication's cited references. Together these metrics enable tracking the transfer of research knowledge across disciplines and citation generations. The two scores share a consistent formulation based on distribution of citations over Web of Science Subject Categories (SCs). The Integration score measures diversity among cited SCs; the Diffusion score measures diversity among citing SCs. Here we study the behavior of Integration and Diffusion scores for benchmark samples of research publications in six major fields (SCs) spanning interests of the National Academies Keck Futures Initiative (NAKFI). We also probe their behavior via two laboratory level analyses. Through long term observation of these labs, and interviews with their senior investigators, we explore "exactly what" Integration and Diffusion scores are tapping.
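For readers unfamiliar with such scores, the following is a minimal sketch of a diversity measure over Subject Categories, not the authors' implementation. The abstract does not state the formula; the sketch assumes the Rao-Stirling form often used in the literature for Integration-type scores, 1 minus the sum over category pairs of s_ij * p_i * p_j, where p_i is the share of citations falling in category i and s_ij is a similarity between categories i and j. The function name, the category labels, and the similarity values are hypothetical.

```python
from collections import Counter


def diversity_score(categories, similarity=None):
    """Rao-Stirling-style diversity of a list of Subject Category labels.

    categories: one SC label per citation (cited SCs for an Integration-type
        score, citing SCs for a Diffusion-type score).
    similarity: optional dict mapping (sc_a, sc_b) pairs to a similarity in
        [0, 1]. Missing pairs are treated as 0 (fully dissimilar), which is
        an assumption of this sketch; a category is always fully similar
        to itself.
    """
    counts = Counter(categories)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("at least one category label is required")

    # Share of citations falling in each Subject Category.
    shares = {sc: n / total for sc, n in counts.items()}

    score = 1.0
    for sc_a, p_a in shares.items():
        for sc_b, p_b in shares.items():
            if sc_a == sc_b:
                s = 1.0
            elif similarity is None:
                s = 0.0
            else:
                s = similarity.get((sc_a, sc_b), similarity.get((sc_b, sc_a), 0.0))
            score -= s * p_a * p_b
    return score


if __name__ == "__main__":
    # Hypothetical reference list: SC label of each cited item.
    cited = ["Chemistry", "Chemistry", "Biology", "Physics"]
    # Hypothetical similarity between two of the categories.
    sim = {("Chemistry", "Biology"): 0.5}
    print(diversity_score(cited, sim))
```

Under these assumptions, a publication whose citations all fall in one category scores 0, and the score rises as citations spread across more, and less similar, categories.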
Analyzing the Effect of Interdisciplinary Research on Patent Evaluation: Case Studies in NBs and DSSCs
Wenping Wang, Beijing Institute of Technology, wangwenping1009@gmail.com
Alan Porter, Georgia Tech and Search Technology Inc, alan.porter@isye.gatech.edu
Ismael Rafols, University of Sussex, i.rafols@sussex.ac.uk
Nils Newman, Intelligent Information Services, newman@iisco.com
Yun Liu, Beijing Institute of Technology, liuyun@bit.edu.cn
Policies facilitating interdisciplinary research (IDR) appear to be based more on conventional wisdom than empirical evidence. This study examines whether IDR leads to higher technological performance. Patents, as major outputs of technological invention, are adopted as the representative measure of technological performance. To look into the relationship between IDR and patent evaluation, we address "patent quality" and "IDR." Disciplinary diversity indicators of patents are developed from the properties of variety, balance, and similarity. Basing the research on patent abstract documents, we evaluate patent quality along three dimensions: technology, market, and legal. We then examine correlations between the diversity and patent quality measures. The case study builds on the emerging domains of Nanobiosensors and Dye-Sensitized Solar Cells. By devising patent metrics commensurate with publication measures, comparison also informs the relationship between research (publication) and patent activity.
Measuring Interdisciplinarity: A Unique Comparison Between the Researcher and Research Proposal
Asha Balakrishnan, IDA Science & Technology Policy Institute, abalakri@ida.org
Vanessa Pena, IDA Science & Technology Policy Institute, vpena@ida.org
Bhavya Lal, IDA Science & Technology Policy Institute, blal@ida.org
Measuring the interdisciplinarity of researchers and research teams is of major interest to agencies funding interdisciplinary research programs. One particular program funding potentially transformative research through interdisciplinary team science at a key R&D agency requested the Science and Technology Policy Institute (STPI) to conduct an assessment of the program's awards funded between FY2007 and FY2009. In this talk, we present a unique analysis that compares the interdisciplinarity of individual researchers with the interdisciplinarity of their awarded proposals funded by this particular program. We will describe the methodology behind measuring both the interdisciplinarity of the principal investigator's publication history and the interdisciplinarity of the proposals, and compare how a PI's interdisciplinarity maps to his/her proposal interdisciplinarity.

Session Title: Juxtapositions in Evaluations With Explicit Human Rights and Social Justice Values
Panel Session 929 to be held in Manhattan on Saturday, Nov 5, 12:35 PM to 2:05 PM
Sponsored by the AEA Conference Committee
Donna Mertens, Gallaudet University, donna.mertens@gallaudet.edu
Abstract: When evaluators position themselves within an explicit social justice and human rights value system, this has implications for how the evaluation is planned, implemented, and used. Frequently, tensions arise when these values are made explicit for a variety of reasons based on differences in perceptions of various stakeholder groups related to the need for evaluations to be value-free; the purpose of doing an evaluation when the project is over; views about the purpose of the program; and effects of including culturally appropriate rituals and manners of interaction on the rigor of the evaluation. Each of these reasons for tensions will be illustrated by means of juxtaposing the differences in viewpoints, followed by presentations to address each of these reasons by means of role plays, use of data to stimulate action, visual displays, and re-enactment of appropriate cultural rituals and manners of interaction. The focus will be on constructive strategies for addressing the tensions in evaluations that are explicitly value-laden in evaluations with ethnic/racial minorities, deaf people, people with disabilities, women in developing countries, and indigenous peoples.
The Juxtaposition of Competing Values Held by Funders and Program Designers as Compared to Members of Marginalized Communities
Katrina Bledsoe, Education Development Center Inc, katrina.bledsoe@gmail.com
The first presentation, by Katrina Bledsoe, will address the juxtaposition of competing values held by funders and program designers as compared to members of marginalized communities. She has experienced evaluations in which the program funders and program designers view the value of a program in terms of benevolence (charity), whereas members of marginalized communities frame the value of the program in terms of social justice (equity). These differences at the starting point of an evaluation call upon the evaluator to creatively make visible the differences in values and the consequences of working from either a benevolence or social justice perspective. Bledsoe will use visual displays and role plays to illustrate how to situate applicable evaluations in terms of their historical context through films, pictures, tours of areas, discussions with elders as well as young folks, and written historical accounts. Her focus is on how she addresses differences in values and the associated consequences in terms of critically analyzing what the program is designed to do, how the program will accomplish its goals, and how the evaluation will contribute to the process. Through this presentation that makes visible historical injustices, she will demonstrate the value marginalized community members place on recognizing their strengths and resiliency.
Differences in Viewpoints With Regard to the Purpose of an Evaluation When Stakeholders Know the Funded Program is Ending
Donna Mertens, Gallaudet University, donna.mertens@gallaudet.edu
The second presentation by Donna M. Mertens will address the differences in viewpoints with regard to the purpose of an evaluation when stakeholders know the funded program is ending. Set within the context of a program to prepare teachers for deaf children who have a disability, she will demonstrate through role plays how the data collected in the middle of the evaluation can be used to stimulate social action to address issues of inequities based on discrimination against people who have a disability or come from homes where English is not the first language.
Motivating Evaluators to be Aware of the Need for Appropriate Support for People With Disabilities Who Participate in an Evaluation
Linda Thurston, National Science Foundation, lthursto@nsf.gov
The third presenter, Linda Thurston, will demonstrate how to appropriately motivate evaluators to be aware of the need for appropriate support for people with disabilities who participate in the evaluation. She will demonstrate through visual displays the barriers that prevent people with disabilities from fully participating in Science, Technology, Engineering, and Mathematics (STEM) projects and how evaluators can serve as the catalyst for appropriate accommodations being provided.
Supporting the Argument for Gender Equality and Human Rights in Evaluation
Belen Sanz, UN Women, belen.sanz@unwomen.org
Inga Sniukaite, UN Women, inga.sniukaite@unwomen.org
The fourth presentation, by Belen Sanz and Inga Sniukaite, is based on their work in gender equality and human rights, which are central to the United Nations mandate to address the underlying causes and utilize processes that align with these values. The UN Women staff support a value-based evaluation approach; however, evaluation managers and evaluators themselves struggle to make these values integral to their work because of budget and time constraints, as well as because of the presumption that evaluations should be value-free. The two presenters will use examples from UN Women evaluations to illustrate through role plays how they support the argument for integrating gender equality and human rights into the evaluations that they fund.
How the Inclusion of Culturally Appropriate Rituals and Manners of Interaction Contributes to the Validity of Evaluations
Kataraina Pipi, Independent Consultant, kpipi@xtra.co.nz
Kataraina Pipi will be the final presenter; she will illustrate how the inclusion of culturally appropriate rituals and manners of interaction contributes to the validity of evaluations conducted with the Maori community in New Zealand. She will demonstrate the cultural rituals and provide examples of how music is used as part of a culturally appropriate evaluation in this context.

Session Title: Tools and Ideas for Mainstreaming Evaluation
Demonstration Session 930 to be held in Monterey on Saturday, Nov 5, 12:35 PM to 2:05 PM
Sponsored by the Graduate Student and New Evaluator TIG
Amy Gullickson, Western Michigan University, amy.m.gullickson@wmich.edu
Abstract: As an evaluator, helping an organization or program mainstream evaluation into its culture, thinking, practices, and systems enables you and the staff to get and use information to (i) create strategies and program designs that meet the needs of your stakeholders, (ii) improve processes and programs, and (iii) understand outputs, outcomes, and impact. In my dissertation research, I studied four organizations that were mainstreaming evaluation. I witnessed how integrating evaluation into the daily life of the staff benefited all through a reduction of the evaluation workload and increases in the accessibility of good data, staff's appreciation of evaluation processes, and their use of findings. Attendees of this demonstration session will learn about a variety of tools and ideas found in my research that they can adapt, integrate, and use in their own evaluation efforts to realize the aforementioned benefits.

Session Title: Rights and Poverty Assessments: Using New Impact Assessment Tools for Community-Company Engagement and Accountability
Panel Session 931 to be held in Oceanside on Saturday, Nov 5, 12:35 PM to 2:05 PM
Sponsored by the Business and Industry TIG
Gabrielle Watson, Oxfam America, gwatson@oxfamamerica.org
Abstract: We present two experiences using new approaches to assess the impacts of private sector actors on local communities in order to foster greater understanding and an improved basis for engagement and improved development outcomes. The first, Human Rights Impact Assessments (HRIAs), have emerged in recent years as a new tool for corporate accountability. Oxfam America has used a community-based methodology, developed by Rights & Democracy of Canada, to put knowledge and power in the hands of communities and the organizations working with them. The second, Oxfam's Poverty Footprint methodology, builds on concepts like 'food miles' and 'environmental footprints', to assess how businesses impact communities touched by their value chains. Panelists present Oxfam's interest in pursuing these assessment methodologies and the experiences of organizations in applying them in actual cases.
Piloting Community-based Human Rights Impact Assessments at Oxfam America
Maria Ezpeleta, Oxfam America, mezpeleta@oxfamamerica.org
Oxfam America supported two partner organizations to adapt a community-based HRIA methodology developed by Rights and Democracy (R&D) of Canada. Two regional teams opted into the pilot initiative to test the HRIA methodology: one involving migrant tobacco pickers in the US, and one involving indigenous communities affected by gas exploration in Bolivia. Oxfam and R&D staff supported local partner organizations to use and adapt the assessment methodology, providing remote support, periodic field visits, and one face-to-face convening of all the project teams. Surveys also assessed learning and usability and gathered feedback on the process. In the final stages of the pilot, a stock-taking exercise helped Oxfam leadership assess what level of investment to put into additional learning and dissemination of the methodology to other Oxfam programs and regions. The panelist describes the purposeful evaluative and learning processes accompanying the pilot program and the lessons derived from them.
Communities Using Human Rights Impact Assessments to Shift Power
Sarah Zipkin, Oxfam America, szipkin@oxfamamerica.org
The Farm Labor Organizing Committee (FLOC) adapted Getting it Right, the Human Rights Impact Assessment methodology developed by Rights & Democracy, to document working conditions of migrant and undocumented tobacco pickers for RJ Reynolds and other tobacco suppliers in North Carolina. Findings revealed substandard housing, long hours of grueling work, and daily threats of pesticide poisoning, heatstroke, and repetitive stress injuries. The tool helped FLOC generate a report, and together with Oxfam, FLOC has publicized the findings to shareholders and to local, state, and federal policy-makers. Sarah Zipkin, Oxfam's Regional Advisor for Private Sector Engagement, worked with FLOC throughout the pilot, helping develop survey methodology, sampling frames, and interview protocols. She presents how FLOC staff adapted Getting it Right across language, cultural, and legal contexts, and shares how they are using the assessment as part of ongoing campaign efforts to improve working conditions of tobacco pickers.
Poverty Footprint Studies: Using Impact Studies to Increase Accountability and Improve Development Outcomes
Chris Jochnick, Oxfam America, cjochnick@oxfamamerica.org
Oxfam developed a poverty footprint methodology that assesses impacts of company activities on livelihoods, health and well-being, diversity and gender, empowerment, and security. Oxfam collaborated with a major food & beverage company to test the methodology in communities in Zambia and El Salvador. The purpose of the studies was for NGOs, companies and stakeholders to understand impacts on people throughout the value chains. Corporate disclosure already exists around governance, financials, environmental and labor practices. Oxfam aims to extend transparency and accountability to poverty impacts, to create a platform for community engagement and opportunities to improve development outcomes for both businesses and communities. The study identified opportunities for creating improvements around labor, women's empowerment, water quality and scarcity, and marketing. For Oxfam, the purpose was to surface impacts and spur engagement. The panelist presents how poverty footprints succeeded in achieving this, the challenges, and lessons being applied going forward.

Session Title: What Works - and for Whom? Value and Relevance in Arts and Culture Evaluation
Multipaper Session 932 to be held in Palisades on Saturday, Nov 5, 12:35 PM to 2:05 PM
Sponsored by the Evaluating the Arts and Culture TIG
Kathleen Tinworth, Denver Museum of Nature and Science, kathleen.tinworth@dmns.org
Design Squared: Evaluation Designs for Design Education
Helene Jennings, ICF Macro, hjennings@icfi.com
Caroline Payson, National Design Museum, paysonc@si.edu
Abstract: Over the past six years, ICF Macro has supported a Smithsonian museum in designing evaluations for its expanding portfolio of professional development in design education for K-12 teachers. The Education Department of the Cooper-Hewitt, National Design Museum has been implementing hands-on teacher training in design education through week-long seminars in New York City and other settings that offer rich resources for design thinking (such as New Orleans and San Antonio). The design process can lead to a deep understanding of abstract concepts taught in schools by carrying out authentic tasks. The partners have developed and applied a range of evaluation methods for formative purposes and to assess varied outcomes, depending on educator interests and how design thinking is brought into the curriculum or school environment. Examples of 'soft' measures as well as a year-long study using data from state math and language arts tests will be presented.
Visitors' Perceptions of the Value of Cultural Institutions: A Multi-Institution Study of Ohio Zoos
Victor Yocco, Institute for Learning Innovation, yocco@ilinet.org
Joe Heimlich, Ohio State University, heimlich@ilinet.org
Abstract: Informal learning venues including art museums, history museums, zoos, etc. must continually seek and provide evidence of the value they provide to visitors and the local community. In this session the authors present findings from a study examining visitors' perceptions of the value of six zoos located in Ohio. Using an instrument developed and tested by the presenters, we explore the differences found both within and between zoos for three value categories identified in the literature: individual, societal, and economic. We will discuss how the data were collected using trained staff and volunteers at each of the organizations and the role of the evaluators in overseeing this process, analysis of the data, and reporting of the findings. Finally, we will discuss some of the differences between and within the participating organizations and how these and other organizations may use the findings to inform future practice and efforts at measuring value.

Session Title: Using Evaluation to Support Innovation and Strategic Planning in Large Scale National and International Environmental Programs
Multipaper Session 933 to be held in Palos Verdes A on Saturday, Nov 5, 12:35 PM to 2:05 PM
Sponsored by the Environmental Program Evaluation TIG
Kara Crohn, Research Into Action, karac@researchintoaction.com
Evaluation of Adaptation Programs to Climate Change Impacts
Claudio Volonte, Global Environment Facility Evaluation Office, cvolonte@thegef.org
Abstract: The Global Environment Facility has provided $50 million in support of investments (22 projects) aimed at reducing vulnerability and increasing adaptive capacity to the adverse effects of climate change, particularly within natural resources management. The GEF Evaluation Office conducted an evaluation of this program, submitted to the GEF Council, to provide lessons for further decisions within the GEF regarding adaptation. The evaluation provides some innovative methodologies to evaluate adaptation programs, along with findings on the achievements so far, the relevance of the program to national priorities and the UN convention on climate change, the efficiency of program implementation, and the sustainability of its outcomes. The findings and recommendations were considered and accepted by the GEF governing body. The evaluation is available on the GEF Evaluation Office website (http://www.thegef.org/gef/node/3726).
Supporting Innovation and Innovators: Ruminations on the Role of Developmental Evaluation in the National Park Service
Jennifer Jewiss, University of Vermont, jennifer.jewiss@uvm.edu
Abstract: As the National Park Service (NPS) prepares for its centennial in 2016, a commission was convened to conduct a yearlong appraisal of the national park system. The Second Century Commission's report, Advancing the National Park Idea (2009), noted that individual parks have developed many innovative programs. The commission identified a pressing need to share programmatic innovations more effectively across the system and support adaptation of innovations to suit highly varied local contexts. An NPS research and think tank, the Conservation Study Institute, has been exploring how it might help address these needs. Several recent projects conducted in partnership with the University of Vermont have taken a developmental evaluation approach. This paper reflects on lessons learned regarding the role that developmental evaluation can play as the NPS seeks to support innovators and advance the sharing of innovation. Insights from the emerging literature on developmental evaluation are featured alongside the author's reflections.
Proposal for a Qualitative Evaluation Method for Environmental Policy Regarding Climate Change
Kiyotaka Nakashima, knakashi@iwate-u.ac.jp
Abstract: This presentation proposes a qualitative method for the evaluation of environmental policy regarding climate change ("climate policy"). This evaluation method emphasizes a sequential and systematic outlook, which attaches importance to the interrelatedness (synthesis and interdisciplinarity) and the process (history) of climate policy. First, a general framework for evaluating climate policy is established through an interdisciplinary review of social science research on environmental policy. Next, the processes of international negotiation and commitment implementation are evaluated by applying the established framework. The points at issue in climate policy are examined in relation to the articles of the international agreements and the evaluation criteria for international cooperation. The proposed method considers the interrelatedness of multiple research objects through feedback between theoretical and empirical research, and between the international negotiation and commitment implementation processes.

Session Title: Appreciative Inquiry and Evaluation: The "What?" and the "How?" in Building Evaluation Capacity
Skill-Building Workshop 934 to be held in Palos Verdes B on Saturday, Nov 5, 12:35 PM to 2:05 PM
Sponsored by the Organizational Learning and Evaluation Capacity Building TIG
Anna Dor, Claremont Graduate University, annador@hotmail.com
Abstract: Appreciative Inquiry (AI) is a phenomenon that looks at what is working right in organizations by engaging people to look at the best of their past experiences in order to imagine the future they want and find the capacity to move into that future. In AI, language that describes deficiencies is replaced by positive questions and approaches. While AI has been used as an organizational behavior intervention, its application as an evaluation tool leads to building evaluation capacity and enabling an organization to become a learning organization. In AI, all stakeholders are actively involved in the evaluation process, which eliminates the "us" vs. "them" perception. This workshop is designed to help participants learn about AI and how they can facilitate an AI process in their organization as internal/external evaluators, consultants, and leaders.

Session Title: Making Contribution Analysis Work: The Benefits of Using Contribution Analysis in Public Sector Settings
Multipaper Session 935 to be held in Redondo on Saturday, Nov 5, 12:35 PM to 2:05 PM
Sponsored by the Program Theory and Theory-driven Evaluation TIG
Steve Montague, Performance Management Network, steve.montague@pmn.net
Abstract: John Mayne's introduction of contribution analysis (CA) has attracted widespread attention within the global evaluation community. Yet, despite this attention, as Mayne himself notes (2011), there haven't been a lot of published studies that involve the systematic application of contribution analysis. This raises a number of questions: How can we bridge the divide between the sustained interest in and actual application of CA? How can we make CA work? What are the actual benefits of CA? The aim of this panel is twofold. First, contributors to this debate will convene and discuss the similarities and differences of their practical implementation of contribution analysis, that is, how to make contribution analysis work. Second, they will discuss the benefits of using contribution analysis.
Contribution Analysis: What is it?
Sebastian Lemire, Ramboll Management Consulting, setl@r-m.com
Steve Montague, Performance Management Network, steve.montague@pmn.net
Contribution analysis appears to have strong potential in the dynamic and complex environments typically faced in public enterprise. The first presentation briefly outlines the main steps and methodological tenets of contribution analysis. How is it done? How is it different from other approaches in theory-driven evaluation? What types of causal claims are supported by CA? The presentation will introduce some of the key challenges to making CA work in public sector settings.
The Participatory Approach to Contribution Analysis
Steve Montague, Performance Management Network, steve.montague@pmn.net
The use of a participatory approach to contribution analysis in the form of 'Results Planning' transforms CA into a 'generative' group learning (and at least partially inductive) process. The presentation will showcase a means of telling the performance story and an 'alternative' approach to the judging of evidence for attribution. The notion developed in this paper is that evidence is assessed in terms of determining the attribution of impacts and effects as one would assess the evidence in a court case (see Patton 2008 for the suggested use of this concept for advocacy). In situations of high complexity, the 'court' would be more akin to a civil trial, judging on the balance of evidence, as opposed to a criminal court, rendering a judgement beyond a reasonable doubt.
Engaging the Client Through Contribution Analysis: Reflections on Practical Experiences and Benefits
Line Dybdal, Ramboll Management Consulting, lind@r-m.com
Sebastian Lemire, Ramboll Management Consulting, setl@r-m.com
This presentation outlines the presenters' current use of and experiences with CA in the Danish public sector. In his concept of the "embedded theory of change" (2011) Mayne addresses the underlying assumptions and risks, external factors, and principal competing explanations embedded in the program being evaluated. This is a central step in CA, as one can only infer credible and, we would argue, useful contribution stories if the embedded theory of change accounts for other influencing factors and disproves alternative explanations. However, CA in its current manifestation is not very prescriptive about how to do this from a practical perspective. First, the presenters will discuss their experiences of embedding clients in the "embedded theory of change" by engaging clients in assembling and assessing the contribution story. Second, the presenters will engage in a discussion on the client benefits of engaging in contribution analysis.

Session Title: Evaluation of STEM Programs
Multipaper Session 936 to be held in Salinas on Saturday, Nov 5, 12:35 PM to 2:05 PM
Sponsored by the Assessment in Higher Education TIG
Howard Mzumara, Indiana University Purdue University Indianapolis, hmzumara@iupui.edu
Engineering Education and Evaluation Capacity Building: An Evaluation Tools Database
Jennifer LeBeau, Washington State University, jlebeau@wsu.edu
Michael Trevisan, Washington State University, trevisan@wsu.edu
Mo Zhang, Washington State University, zhangmo@wsu.edu
Denny Davis, Washington State University, davis@wsu.edu
Abstract: Evaluation Capacity Building (ECB) is an emerging construct in evaluation, now understood as essential for ensuring high-quality evaluations are conducted and supported. While several definitions for ECB exist, each maintains components and strategies that can be employed to increase ECB in a particular field or context. In this paper, an Evaluation Tools Database is examined as a way of enhancing ECB in the field of engineering education. The database documents characteristics of evaluation tools developed and used by engineering educators over the last ten years. This paper examines the Evaluation Tools Database in relation to Stockdill, Baizerman, and Compton's (2002) conceptual definition of ECB to offer one example of an effective method for building evaluation capacity in engineering education. Stockdill, S.H., Baizerman, M., & Compton, D.W. (2002). Toward a definition of the ECB process: A conversation with the ECB literature. New Directions for Evaluation, 93, 7-25.
Assessing the Impact of Undergraduate Research and Mentoring on Student Learning in the Biological Sciences
Howard Mzumara, Indiana University Purdue University Indianapolis, hmzumara@iupui.edu
Abstract: How can we assess the impact and effectiveness of undergraduate research and mentoring programs designed to increase the number and diversity of individuals who pursue graduate studies or careers in biology? This presentation will provide participants with a preliminary report that describes ongoing efforts to assess the impact and effectiveness of an Undergraduate Research and Mentoring (URM) program targeting junior and senior undergraduate students interested in pursuing graduate degrees or careers in Biological Sciences. The study is based on a preliminary evaluation of an NSF-funded project that involved the creation and delivery of an innovative URM program focused on the 'theme of biological signaling'. The presentation will introduce participants to tools used in measuring intellectual gains as a result of participating in an intensive URM summer program. Also, participants will be engaged in an interactive discussion on viable approaches for evaluating the impact and effectiveness of URM programs in universities.
Evaluating Transferable Skills From the Capstone Experience
Mary Moriarty, Smith College, mmoriart@smith.edu
Susannah Howe, Smith College, showe@smith.edu
Abstract: Evaluation in science, technology, engineering, and mathematics education normally involves the need to satisfy multiple stakeholders. Program managers, educators, accreditation agencies, and evaluators often see evaluation from different perspectives. This paper reports on a unique evaluation of transferable skills in an engineering capstone design course conducted by an engineering faculty member and an evaluation specialist. Undergraduate engineering programs commonly culminate in a capstone design course, which meets accreditation requirements of a major design experience and provides students an opportunity to synthesize and apply previous learning as preparation for future work. Transfer activities (an initial individual written assignment and a team-based transfer map) were conducted in class with capstone students and virtually with alumni. These activities were used to promote reflection about and documentation of transferable skills and knowledge. This paper reports results, discusses the rationale behind methods, and explores the value for students, alumni, and educators.
A Teaching Career Path in Science, Technology, Engineering, and Math (STEM) Education: An Evaluation of a Noyce Teacher Recruitment Initiative in Physics and Chemistry
Meltem Alemdar, Georgia Tech, meltem.alemdar@ceismc.gatech.edu
Abstract: This paper presents an evaluation of the Teacher Recruitment Initiative in Physics and Chemistry program, a collaboration between Kennesaw State University (KSU) and Georgia Institute of Technology (GT). This program is part of the Noyce Teacher Recruitment Initiative funded through the National Science Foundation. It is designed to encourage and enable GT and KSU undergraduate science and engineering majors to pursue careers in high school chemistry and physics teaching. It provides scholarships to science and engineering majors over a four-year period, to be awarded during their senior undergraduate year and during their enrollment in KSU's MAT program. This paper focuses on the program during its year of implementation and its effects on participants. Program impact is evaluated in two areas: (1) recruitment strategies and (2) program effectiveness. Analysis is triangulated across multiple data sources (surveys, interviews, observations, and documents). This paper will also elaborate on critical issues in STEM evaluation.

Session Title: Before It's Too Late: Lessons Learned From Strategic Learning Evaluations of Advocacy Efforts
Panel Session 937 to be held in San Clemente on Saturday, Nov 5, 12:35 PM to 2:05 PM
Sponsored by the Advocacy and Policy Change TIG
Sarah Stachowiak, Organizational Research Services, sarahs@organizationalresearch.com
Sarah Stachowiak, Organizational Research Services, sarahs@organizationalresearch.com
Huilan Krenn, WK Kellogg Foundation, hyk@wkkf.org
Abstract: Advocates for policy change - in this country and around the world - are impassioned and focused. They may resist advice about strategy from "outsiders" like evaluators. Effective formative evaluation tools and approaches can help them see the value of evaluation and improve their strategies. This session will share some methods, tools and approaches for working with clients to use evaluative data and lessons learned to adjust advocacy strategy. We will also examine how we as evaluators - in turn - adjust our approach in response to our clients' changes in strategy and in response to changes in the political or social environment. Speakers will draw on experiences in the United States, Europe and Africa helping clients and partners make mid-course corrections to their policy change efforts.
Learning as We Go, but Are We Going Far Enough?
David Devlin-Foltz, The Aspen Institute, david.devlin-foltz@aspeninst.org
Learning as we go is APEP's mantra for our clients. Sometimes clients go along easily. Our work with the Advocacy Progress Planner in the United States, and in modified form in Tanzania and France, has yielded some encouraging examples of shared commitment to learning among funders, advocates and evaluators. We have collaborated with other clients in the U.S. on defining more precisely what kinds of changes in behavior they want to see from key policymakers and influential former officials. We have worked with clients to track their contribution to those changes in policymaker behavior over time. This too has helped clients adjust their focus and target their advocacy more precisely. Even the best benchmarks tell us nothing unless we have reliable data about progress, of course. We will discuss the challenge of identifying and using benchmarks that are both meaningful and measurable.
Strategic Learning in the Long-term: Issues and Challenges
Julia Coffman, Center for Evaluation Innovation, jcoffman@evaluationexchange.org
For almost nine years, Harvard Family Research Project has been using a strategic learning approach to evaluate the David and Lucile Packard Foundation's Preschool for California's Children grantmaking program. This year, a teaching case was developed on the evaluation of this long-term advocacy effort to highlight the issues that arise from using a strategic learning approach to evaluation in the long-term, identifying key points at which the evaluation switched course because methods were not working, or because the Foundation's strategy shifted. This presentation will highlight responses to key questions that are relevant to all strategic learning approaches, such as: How to evolve the evaluation in response to changing strategy; how to "embed" the evaluator while maintaining role boundaries; how to manage often competing learning versus accountability needs; and how to time data collection so it is rapid and "just in time" but also reliable and credible.
Evaluating for Strategic Learning Within an Education Reform Effort: Lessons Learned
Anne Gienapp, Organizational Research Services, agienapp@organizationalresearch.com
In 2009-10, with the support of the W.K. Kellogg Foundation, ORS worked with The Chalkboard Project, a funding collaborative engaging in K-12 education reform efforts in Oregon, to conduct a prospective evaluation of Chalkboard's advocacy efforts. During this period, significant changes occurred in the policy environment, necessitating adjustments in the organization's direction for policy change and an accompanying revisit of the recently developed strategic plan. This dynamic environment also required the prospective evaluation design and data collection plan to become an ongoing, iterative process. The Chalkboard experience may characterize the very nature of prospective evaluation of advocacy efforts, which are most usefully conducted at a time when flexibility and rapid feedback are needed to inform strategic decision-making. This presentation will share insights and lessons learned regarding approaches for data collection, sharing results, and reporting findings, and how these processes strengthened the client's advocacy efforts.

Session Title: Participatory Methodologies: Innovations and Effectiveness in Impact Evaluation
Think Tank Session 938 to be held in San Simeon A on Saturday, Nov 5, 12:35 PM to 2:05 PM
Sponsored by the Collaborative, Participatory & Empowerment Evaluation TIG
Cheryl Francisconi, Institute of International Education, Ethiopia, cfrancisconi@iie-ethiopia.org
Amparo Hoffman-Pinilla, New York University, ahp1@nyu.edu
Judith Kallick Russell, Independent Consultant, jkallickrussell@yahoo.com
Abstract: How can Action Research contribute to evaluation methodologies? What is the relevance of innovative and traditional methodologies for assessing the impact of programs in a particular field? In this session we will introduce the discussion topic by briefly sharing the innovative methodology of a Participatory Action evaluation conducted by New York University's Research Center for Leadership in Action of the Institute of International Education's Leadership Development for Mobilizing Reproductive Health Program based in Ethiopia, India, Nigeria, Pakistan and the Philippines. Group discussions will focus on AEA participants' experiences associated with: a) the advantages and challenges associated with a Participatory approach to evaluation; and b) the task of evaluating programs to measure their influence or impact. The session will close with a group reflection articulating lessons learned for using participatory methodologies in program evaluation.

Session Title: Evaluating Intra- and Inter-institutional Collaboration to Enhance Student Learning
Multipaper Session 939 to be held in San Simeon B on Saturday, Nov 5, 12:35 PM to 2:05 PM
Sponsored by the Pre-K - 12 Educational Evaluation TIG
Tara Shepperson, Eastern Kentucky University, tara.shepperson@eku.edu
Chad Green, Loudoun County Public Schools, chad.green@loudoun.k12.va.us
Conceptualizing and Evaluating the Effectiveness of Intra-Institutional Collaboration: A Case Study of Purdue University's Research Goes to School Project
Kasey Goodpaster, Purdue University, scott66@purdue.edu
Omolola Adedokun, Purdue University, oadedok@purdue.edu
Sandra Laursen, University of Colorado, Boulder, sandra.laursen@colorado.edu
Lisa Kirkham, Purdue University, lkirkham@purdue.edu
Loran Parker, Purdue University, carleton@purdue.edu
Wilella Burgess, Purdue University, wburgess@purdue.edu
Gabriela Weaver, Purdue University, gweaver@purdue.edu
Abstract: Intra-institutional collaboration allows distinct entities within universities to pool their resources to broaden the impact of research. However, despite recent calls for the use of innovative collaboration to increase the relevance of research to education, especially in STEM disciplines, empirical research and evaluation of intra-institutional collaboration is scant. This gap in literature is not unrelated to the lack of a commonly agreed upon conceptualization of intra-institutional collaborations in university settings. This session will describe the process of developing a framework for the conceptualization and evaluation of an intra-institutional collaboration initiative, Purdue University's Research Goes to School project, funded by the National Science Foundation's Innovation through Institutional Integration program. In this project, three existing programs and research centers collaborate to develop, implement and test a science curriculum that brings university-level research into rural high school classrooms. Presenters will describe the process of conceptualizing, measuring, and evaluating the effectiveness of intra-institutional collaboration.
Lessons Learned From Evaluating a Complex, Multi-Partner, Multi-Year, Multi-Site K-12 Education Intervention
Harouna Ba, Center for Children and Technology, hba@edc.org
Elizabeth Pierson, Center for Children and Technology, epierson@edc.org
Terri Meade, Center for Children and Technology, tmeade@edc.org
Abstract: In the aftermath of Hurricane Katrina, a multi-national technology corporation partnered with eight districts in Mississippi and Louisiana in order to transform them into 21st Century Learning Systems. Working closely with the funder, district leaders and various partnering organizations, EDC's Center for Children and Technology (CCT) designed and conducted the Initiative's formative and summative evaluations. Using qualitative and quantitative methods, researchers documented the experiences of leaders, technology staff, teachers, students, and parents who were instrumental in the implementation of the project. This panel will share and discuss the initiative's theories of implementation and impact, methodologies, and the lessons learned over the course of the implementation and evaluation of this complex, multi-year, multi-site, and multi-partner education intervention.
Evaluating Professional Development for Principals: The Impact of School Leadership on Teacher Collaboration and Instruction
Craig Outhouse, University of Massachusetts, craigouthouse@gmail.com
Rebecca Woodland, University of Massachusetts, Amherst, rebecca.woodland@educ.umass.edu
Abstract: This evaluation study examined the quality of a 3-year district-wide PLC professional development program for school administrators intended to build principal capacity to develop and sustain effective teacher collaboration. Findings of the evaluation, including the impact of leadership on collaboration quality, teachers' expectations to collaborate, teacher attitude toward the value of collaboration, and quality of instruction will be discussed. Stakeholders used evaluation findings to identify the high-leverage administrative behaviors that contributed most positively to improvements in teacher collaboration, instructional practice, sense of job satisfaction, and enhanced student learning. In addition to evaluation findings, session participants will learn about the key design attributes of this district's PLC professional development program for principals, and be presented with the primary assessment instrument used to collect data about school leadership, quality of teacher collaboration, instruction, and student learning.
Science, Technology, Engineering, and Math (STEM) Business Partnerships with High School Mathematics: Evaluation Towards Rigor, Relevance, Relationships, and Results for At-Risk and High-Performing Students
Paul Gale, San Bernardino County Superintendent of Schools, ps_gale@yahoo.com
Abstract: This ongoing evaluation study focuses on two initiatives targeting different high school student populations. Both initiatives encompass math and science activities that are co-developed by Southern California business partners and math content experts, who work with diverse student populations from urban high schools. In each initiative, professional business partners or law enforcement investigators lead students through authentic examples of the use of math and science in their careers. The ultimate goal of the initiatives is to motivate students to pursue Science, Technology, Engineering, and Math (STEM) careers from local businesses. The study addresses five implementation questions focused on students' learning, engagement, and interests. The presentation will provide a framework for the study, present results, and examine issues raised by stakeholders.

In a 90-minute Roundtable session, the first rotation uses the first 45 minutes and the second rotation uses the last 45 minutes.
Roundtable Rotation I: The 'Roadmap to Effectiveness': A Discussion on Assessing Evaluation Readiness, Program Development & Valuing Multiple Voices
Roundtable Presentation 940 to be held in Santa Barbara on Saturday, Nov 5, 12:35 PM to 2:05 PM
Sponsored by the Internal Evaluation TIG
Lisa M Chauveron, The Leadership Program, lisa@tlpnyc.com
Amanda C Thompkins, The Leadership Program, athompkins@tlpnyc.com
Abstract: The Leadership Program is an urban youth development organization that serves more than 18,000 youth, 6,000 parents, and 500 teachers annually. The Leadership Program encourages staff to innovate within current programs and to develop new ones to best meet participants' needs. As a result, the internal evaluation team reviews programs that run the gamut of program development and evaluation readiness, yet grants and contracts require outcomes for every program. To help our staff understand the range of evaluation options available, we created an internal tool called the Roadmap to Effectiveness, which gives voice to multiple stakeholders and identifies seven stages of program development ranging from exploratory (for pilot programs) to boxing (for programs that have been tested at scale). This roundtable session will review the Roadmap tool and discuss the process and politics of evaluating multiple programs at different points of development, including cases where values clash.
Roundtable Rotation II: Ensuring Objectivity in Evaluation: Merging Continuous Quality Improvement and Outcome Evaluation Using a Triad Approach
Roundtable Presentation 940 to be held in Santa Barbara on Saturday, Nov 5, 12:35 PM to 2:05 PM
Sponsored by the Internal Evaluation TIG
Tamika Howell, Harlem United Community AIDS Center Inc, thowell@harlemunited.org
Abstract: Harlem United Community AIDS Center, Inc (HU) routinely incorporates a tri-modal approach (The Triad) to organizing and focusing on the most critical information needed to ensure objective and effective decision-making, problem solving, and program management in a setting where data is crucial to the longevity of programs and service to clients. This roundtable will present a general overview of the Triad, focusing specifically on the integration of continuous quality improvement (CQI) and outcome evaluation. The discussion will offer members of the evaluation community the opportunity to communicate their experiences and to participate in a dialog about strategies for ensuring an objective approach to internal evaluation. Participants will also be asked to provide feedback about the Triad and offer suggestions for strengthening the link between CQI and outcome evaluation.

Session Title: Making Sense of Measures
Panel Session 941 to be held in Santa Monica on Saturday, Nov 5, 12:35 PM to 2:05 PM
Sponsored by the Quantitative Methods: Theory and Design TIG
Mende Davis, University of Arizona, mfd@u.arizona.edu
Abstract: In every evaluation, there is an overabundance of measures to choose from. Which instrument should we choose? What can we use to guide our decisions? Evaluators often select instruments based on their previous experience, ready access to an instrument, its cost, whether the evaluation staff can administer the measure, and the time and effort required to collect the data. Every evaluation has a budget, in time, money, and respondent burden, and measures must fit within the budget. When inundated with measures, it is hard to see the differences between them. We suggest that a taxonomy can be used to categorize social science methods and guide a more effective selection of study measures. Some methods require considerable effort to collect, others are simple and inexpensive. Low-cost alternatives are often overlooked. In this panel, we will provide an overview of method characteristics and how to take advantage of them in evaluation designs.
Mende Davis, University of Arizona, mfd@u.arizona.edu
The first presentation will provide an overview of a taxonomic approach to methods and illustrate how it can be used to organize potential evaluation measures for health outcomes. In the process, we will examine the overlap of methods and data collection methods that may show promise for further research.
Who? Me?
Sally Olderbak, University of Arizona, sallyo@email.arizona.edu
Michael Menke, University of Arizona, menke@email.arizona.edu
Self-report is the primary workhorse of evaluation research. Ratings by others are harder to obtain but may be a better choice when self-rating is impossible. We will review the existing literature regarding self-rating and other raters with a focus on their validity and reliability. We will present a novel empirical example of peer rating. This panel presentation will also include sources of no-cost instruments with documented reliability and validity that can be used in evaluation research.
How Many Items?
Sally Olderbak, University of Arizona, sallyo@email.arizona.edu
This presentation will focus on the literature regarding single- versus multiple-item instruments and discuss the advantages and disadvantages of increasing numbers of items in terms of reliability and validity. We will illustrate this talk with the development of the Arizona Life History Battery and its short form, the 20-item Mini-K. The Arizona Life History Battery (ALHB) is a 199-item battery of cognitive and behavioral indicators of life history strategy compiled and adapted from various original sources. The Mini-K Short Form (Figueredo et al., 2006) may be used separately to substitute for the entire ALHB and reduce research participant response burden.
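As a rough illustration of the trade-off between item count and reliability described above, the sketch below applies the Spearman-Brown prophecy formula, which predicts how reliability changes when a scale is lengthened or shortened by a factor k under the assumption of parallel items. This is not taken from the presentation: the function name and the starting reliability of 0.90 are hypothetical, and treating a multi-scale battery as a single homogeneous scale is a simplification. Only the item counts (199 and 20) come from the abstract.

```python
def spearman_brown(reliability: float, length_factor: float) -> float:
    """Predicted reliability after multiplying test length by length_factor.

    Spearman-Brown prophecy formula: k * r / (1 + (k - 1) * r),
    which assumes the added or removed items are parallel to the rest.
    """
    if not 0.0 <= reliability <= 1.0:
        raise ValueError("reliability must be between 0 and 1")
    if length_factor <= 0:
        raise ValueError("length_factor must be positive")
    return (length_factor * reliability) / (1.0 + (length_factor - 1.0) * reliability)


full_items = 199                  # item count of the full battery (from the abstract)
short_items = 20                  # item count of the short form (from the abstract)
assumed_full_reliability = 0.90   # hypothetical value, for illustration only

predicted = spearman_brown(assumed_full_reliability, short_items / full_items)
print(f"Predicted short-form reliability: {predicted:.3f}")
```

The formula makes the cost of shortening explicit: fewer items reduce respondent burden, but predicted reliability drops unless the retained items are stronger than average.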
More Methods!
Michael Menke, University of Arizona, menke@email.arizona.edu
Sally Olderbak, University of Arizona, sallyo@email.arizona.edu
Mende Davis, University of Arizona, mfd@u.arizona.edu
The final presentation will demonstrate how a method taxonomy can be used to organize social science outcome measures for Alzheimer's dementia. Instruments with different names are frequently considered to be different methods; however, this is often not the case. Two multi-item paper-and-pencil tests may share all of the same method biases. If an evaluation relies on measures that are similar in nearly every way, the study results can be biased. When evaluators have the opportunity to include multiple measures, it is important to make sure that the multiple instruments are actually different. Evaluators constantly deal with cost limitations in all phases of evaluation research. We suggest that evaluators think in terms of 'item' budgets as well.

Session Title: Integrating Realist Evaluation Strategies to Achieve 100% Evaluation of all Education, Social Work, Health, Youth Justice and Other Human Services: Example of Chautauqua County, NY and Moray Council, Scotland
Demonstration Session 942 to be held in Sunset on Saturday, Nov 5, 12:35 PM to 2:05 PM
Sponsored by the Human Services Evaluation TIG and the Social Work TIG
Mansoor Kazi, State University of New York, Buffalo, mkazi@buffalo.edu
Rachel Ludwig, Chautauqua County Department of Mental Hygiene, mesmerr@co.chautauqua.ny.us
Jeremy Akehurst, The Moray Council, Scotland, jeremy.akehurst@moray.gov.uk
Anne Bartone, University at Buffalo, bartonea@hotmail.com
Abstract: This demonstration will illustrate how realist evaluation strategies can be applied in the evaluation of 100% natural samples in schools, health, youth justice and other human service agencies for youth and families. These agencies routinely collect data that is typically not used for evaluation purposes. This demonstration will include new data analysis tools drawn from both the efficacy and epidemiology traditions to investigate patterns in this data in relation to outcomes, interventions and the contexts of practice. For example, binary logistic regression can be used repeatedly with whole school databases at every marking period to investigate the effectiveness of school-based interventions and their impact on school outcomes. The demonstration will include practice examples drawn from the SAMHSA funded System of Care that has enabled a 100% evaluation of over 40 agencies in Chautauqua County, New York State; and education, social work and youth justice services in Moray Council, Scotland.
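The idea of rerunning a binary logistic regression at every marking period can be sketched as follows. This is a minimal illustration under stated assumptions, not the presenters' analysis: the records are invented, the single predictor (received the intervention or not) and the outcome (passed or not) are hypothetical, and the model is fitted with plain gradient ascent so the example needs only the standard library.

```python
import math


def fit_logistic(xs, ys, learning_rate=0.5, steps=5000):
    """Fit P(y=1) = sigmoid(b0 + b1*x) by gradient ascent on the mean log-likelihood."""
    b0 = b1 = 0.0
    n = len(xs)
    for _ in range(steps):
        g0 = g1 = 0.0
        for x, y in zip(xs, ys):
            p = 1.0 / (1.0 + math.exp(-(b0 + b1 * x)))
            g0 += y - p          # gradient with respect to the intercept
            g1 += (y - p) * x    # gradient with respect to the slope
        b0 += learning_rate * g0 / n
        b1 += learning_rate * g1 / n
    return b0, b1


# Hypothetical whole-school records: (marking period, received intervention, passed).
records = []
for period, control_pass, treated_pass in [("Q1", 4, 6), ("Q2", 4, 8)]:
    for i in range(10):
        records.append((period, 0, 1 if i < control_pass else 0))
        records.append((period, 1, 1 if i < treated_pass else 0))

# Refit the same model for each marking period and report the odds ratio.
odds_ratios = {}
for period in sorted({r[0] for r in records}):
    rows = [r for r in records if r[0] == period]
    _, slope = fit_logistic([r[1] for r in rows], [r[2] for r in rows])
    odds_ratios[period] = math.exp(slope)
    print(period, "odds ratio for intervention:", round(odds_ratios[period], 2))
```

In practice a statistics package and additional context variables would be used; the point here is only the repeated, per-period fit on routinely collected data.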

Session Title: Experimental and Quasi-experimental Designs in Educational Evaluation
Multipaper Session 943 to be held in Ventura on Saturday, Nov 5, 12:35 PM to 2:05 PM
Sponsored by the Pre-K - 12 Educational Evaluation TIG
Andrea Beesley, Mid-continent Research for Education and Learning, abeesley@mcrel.org
Eric Barela, Partners in School Innovation, ebarela@partnersinschools.org
The Effectiveness of Mandatory-Random Student Drug Testing
Susanne James-Burdumy, Mathematica Policy Research, sjames-burdumy@mathematica-mpr.com
Brian Goesling, Mathematica Policy Research, bgoesling@mathematica-mpr.com
John Deke, Mathematica Policy Research, jdeke@mathematica-mpr.com
Eric Einspruch, RMC Research, eeinspruch@rmccorp.com
Abstract: The Mandatory-Random Student Drug Testing (MRSDT) Impact Evaluation tested the effectiveness of MRSDT in 7 school districts and 36 high schools in the United States. The study is based on a rigorous experimental design that involved randomly assigning schools to a treatment group that implemented MRSDT or to a control group that delayed implementation of MRSDT. To assess the effects of MRSDT on students, we administered student surveys at baseline and follow up, collected school records data, conducted interviews of school and district staff, and collected data on drug test results. Over 4,000 students were included in the study. The presentation will focus on the study's findings after the MRSDT programs had been implemented for one school year.
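Random assignment of schools to treatment and control groups, as described above, can be sketched in a few lines. This is an assumption-laden illustration rather than the study's procedure: the school identifiers, the seed, and the equal split are hypothetical, and the abstract does not say whether assignment was blocked by district. Only the count of 36 high schools comes from the abstract.

```python
import random


def assign_schools(school_ids, seed):
    """Randomly split schools into equal-sized treatment and control groups."""
    rng = random.Random(seed)   # seeded so the assignment can be reproduced and audited
    shuffled = list(school_ids)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return {
        "treatment": sorted(shuffled[:half]),   # would implement the program now
        "control": sorted(shuffled[half:]),     # would delay implementation
    }


# Hypothetical identifiers for the 36 high schools mentioned in the abstract.
schools = [f"school_{i:02d}" for i in range(1, 37)]
groups = assign_schools(schools, seed=2011)
print(len(groups["treatment"]), "treatment schools;", len(groups["control"]), "control schools")
```

Assigning whole schools rather than individual students keeps the intervention intact within each school, at the cost of fewer independent units for the analysis.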
Striving for Balance: The Value of Publishing Rigorous Studies with Insignificant Findings
Jill Feldman, Research for Better Schools, feldman@rbs.org
Debra Coffey, Research for Better Schools, coffey@rbs.org
Ning Rui, Research for Better Schools, rui@rbs.org
Allen Schenck, RMC Corporation, schencka@rmcarl.com
Abstract: Impact analyses from a four-year experimental study of READ 180, an intervention targeting struggling adolescent readers, showed no differences in reading performance between students assigned to treatment and those assigned to control. Furthermore, results were consistent when data were analyzed by grade level, regardless of whether students had one or two years of treatment in any study year. In a related study, researchers explored whether students in different ability subgroups benefited from participation in READ 180. In addition to the lack of significant findings from the RCT, results failed to reveal subgroups of students for whom READ 180 worked better than the district's regular instruction. The presentation will conclude with a discussion about the value to research consumers of publishing rigorously designed studies when findings suggest a program does not work better than current practices, especially when the stakes for students and for society are high.
How to Train Your Dragon: One Story of Using a Quasi-Experimental Design Element in a School-Based Evaluation Study
Tamara M Walser, University of North Carolina, Wilmington, walsert@uncw.edu
Michele A Parker, University of North Carolina, Wilmington, parkerma@uncw.edu
Abstract: Given current requirements under No Child Left Behind (NCLB) for the implementation of educational programs supported by scientifically based research and the U.S. Department of Education's related priority for experimental and quasi-experimental studies in federal grant competitions, it is important that education evaluators (a) identify evaluation designs that meet the needs of the current NCLB climate, (b) address evaluation questions of causal inference, and (c) implement designs that are feasible and ethical for school-based evaluation. The purpose of this presentation is to describe the use of a quasi-experimental design element as part of a larger longitudinal evaluation study. Specifically, we focus on implementation of the design element including issues encountered; the success of the design in terms of internal and external validity; and lessons learned and related implications for school-based evaluation practice, as well as evaluation practice in general.
