Evidence-based Software Engineering
Barbara A. Kitchenham 1,3, Tore Dybå 2,4, Magne Jørgensen 2
1 National...
2 ... Norway
3 Dept. ... UK
4 Dept. ...
Abstract
Objective: Our objective is to describe how software engineering might benefit from an evidence-based approach and to identify the potential difficulties associated with the approach.
Method: We compared the organisation and technical infrastructure supporting evidence-based medicine (EBM) with the situation in software engineering. We considered the impact that factors peculiar to software engineering (i.e. the skill factor and the lifecycle factor) would have on our ability to practise evidence-based software engineering (EBSE).
Results: EBSE promises a number of benefits by encouraging integration of research results with a view to supporting the needs of many different stakeholder groups. However, we do not currently have the infrastructure needed for widespread adoption of EBSE. The skill factor means software engineering experiments are vulnerable to subject and experimenter bias. The lifecycle factor means it is difficult to determine how technologies will behave once deployed.
Conclusions: Software engineering would benefit from adopting what it can of the evidence approach provided that it deals with the specific problems that arise from the nature of software engineering.
Medical research changed dramatically as a result of adopting an evidence-based paradigm. Studies showed, on the one hand, that failure to organise medical research in systematic reviews could cost lives and, on the other hand, that the clinical judgement of experts compared unfavourably with the results of systematic reviews. Since the publication of these influential papers, medical researchers have adopted the evidence-based paradigm. The number of articles about evidence-based practice has grown from 1 publication in 1992 to about a thousand in 1998, and international interest has led to the development of 6 evidence-based journals specialising in systematic reviews.
The success of evidence-based medicine has prompted many other disciplines that provide services to members of the public to attempt to adopt a similar approach, including, for example, psychiatry (footnote 1).
We do not suggest that software engineers should adopt a new practice just because “everyone else is doing it”, particularly since the evidence-based movement has its critics. Hammersley points out that research is fallible, relies on generalisations that may be difficult to interpret, and is often insufficient for determining appropriate means for delivering best practice. Nevertheless, we believe that a successful innovation in a discipline that attempts to harness scientific advances for the benefit of society merits investigation. In this paper we therefore discuss the possibility of evidence-based software engineering using an analogy with medical practice. We also identify two areas where the analogy with medicine breaks down. This allows us to identify a number of serious problems that need to be resolved if EBSE is to become a reality.
1 www.med.nagoya-cu.ac.jp/psych.dir/ebpcenter.htm
2 www.york.ac.uk/healthsciences/centres/evidence/cebn.htm
3 www.evidencenetwork.org
4 cem.dur.ac.uk/ebeuk/EBEN.htm
Why evidence is important in software engineering
Initially it is worthwhile considering why evidence would be beneficial to software developers, users and other stakeholders, e.g. certification bodies and the general public. EBSE is potentially important because of the central place software intensive systems are starting to take in everyday life. For example, current plans for advanced life-critical systems such as drive-by-wire applications for cars and wearable medical devices have the potential for immense economic and social benefit but can also pose a major threat to industry. If systems are reliable, the quality of life of individual citizens will be enhanced. However, there are far too many examples of systems that have not only wasted large amounts of public money but have also caused harm to individual citizens (e.g. the automated command and control system for the London Ambulance Service). Individual citizens have a right to expect their governments to properly administer tax revenues used to commission new software systems and put in place controls to minimise the risk of such systems causing harm.
There are many strategies to improve the dependability of software involving the adoption of “better” software development procedures and practices. For example, the Capability Maturity Model and SPICE suggest procedures for improving the software production process, and the professional bodies are establishing procedures for certification of individual software engineers. However, both the high level process and the individual engineers are constrained by the specific technologies (methods, tools and procedures) they use. In most cases software is built with technologies for which we have insufficient evidence to confirm their suitability. Consequently, it is difficult to be sure that changing software practices will necessarily be a change for the better. It is possible that EBSE can provide the mechanisms needed to assist practitioners to adopt appropriate technologies and to avoid inappropriate technologies.
The goal of EBSE
The goal of evidence-based medicine (EBM) is “the integration of best research evidence with clinical expertise and patient values”. We suggest that the goal of evidence-based software engineering (EBSE) should be: to provide the means by which current best evidence from research can be integrated with practical experience and human values in the decision making process regarding the development and maintenance of software.
EBSE would provide:
• A common goal for individual researchers and research groups to ensure that their research is directed to the requirements of industry and other stakeholder groups.
• A means by which industry practitioners can make rational decisions about technology adoption.
• A means to improve the dependability of software-intensive systems, as a result of better choice of development technologies.
• A means to increase the acceptability of software-intensive systems that interface with individual citizens.
• An input to certification processes.
Practising EBSE
Sackett et al. set out five steps for practising evidence-based medicine (see Table 1). Although there are some detailed medical references, it is easy to reformulate the steps to address evidence-based software engineering (see column 3 of Table 1). Indeed, we would hazard a guess that at least part of the attraction that evidence-based medicine has for other disciplines is the ease with which the basic steps can be adapted to other fields. However, it is important to remember that even if high-level process steps for evidence-based practice appear to be similar for medicine and software engineering, this does not guarantee that the underlying scientific, technological and organisational mechanisms that support evidence-based medicine apply to evidence-based software engineering.
Individual developers seldom have the option to pick and choose the technologies they are going to use. Technology adoption is often decided either by project managers on a project by project basis, or by senior managers on a departmental or organizational basis. Moreover, our concern is not usually the specific task to which the technology is applied but the outcome of the project of which the task is a part.
Table 1. Five steps used in Evidence-based Medicine and (by analogy) in Evidence-based Software Engineering.
Step | Evidence-based Medicine | Evidence-based Software Engineering
1 | … etc) into an answerable question. | … management procedures etc.) into an answerable question.
2 | Tracking down the best evidence with which to answer that question. | …
3 | … and applicability (usefulness in our clinical practice). | … and applicability (usefulness in software development practice).
4 | Integrating the critical appraisal with our clinical expertise and with our patient's unique biology, … | Integrating the critical appraisal with our software engineering expertise and with our stakeholders’ values and circumstances.
5 | Evaluating our effectiveness and efficiency in executing Steps 1-4 and seeking ways to improve them both for next time. | Evaluating our effectiveness and efficiency in executing Steps 1-4 and seeking ways to improve them both for next time.
In addition to the viewpoint of the individual practitioner, there are two other viewpoints that are important for EBSE in practice:
1. That of a project manager who wants to achieve a favourable outcome for a particular project.
2. That of a senior manager who wants to improve the performance of a department or organization as a whole.
We believe EBSE (and indeed EBM) places requirements on researchers:
• To improve the standard of individual empirical studies and systematic reviews of such studies.
• To identify outcome measures that are meaningful to practitioners.
• To report their results in a manner that is accessible to practitioners.
• To perform and report replication studies.
Defining an answerable question
Following Sackett et al., a well-formulated question has several parts, including:
• The study factor (e.g. …).
• The population (the disease group or spectrum of the well population).
… so they usually concentrate on the first two parts. This would be the same for EBSE.
EBM researchers point out that it is important that the question is broad enough to allow examination of variation in the study factor and across populations. For EBSE, the study factor would be the technology of interest. The technology should not be specified at too high a level of abstraction (e.g. …), but must be general enough to identify the majority of relevant empirical studies. For some questions it may be necessary to be even more precise (e.g. … or statistically-derived estimation models).
It is even more difficult to determine the correct level of abstraction for specifying the population of interest. The population of interest may be categorised in many dimensions based on, for example, the experience of technology users and the types of problem addressed by the technology. However, even fairly broad categories may be counter-productive if useful empirical evidence is lost by restrictions imposed by such categorisation.
Finding the best evidence
One of the reasons for formulating the question precisely is to help researchers and practitioners to find all relevant studies. According to the Australian National Health and Medical Research Council, there are over 20,000 journals in the biomedical field. A major problem for EBM is finding relevant papers from the massive amount of published work. EBM researchers look for systematic reviews, i.e. papers that have already assembled all relevant reports on a particular topic. They also use the question of interest to construct search strings aimed at finding relevant individual studies. In doing so, they have a large amount of technological and scientific infrastructure to support them:
• There are several organisations (in particular the international Cochrane Collaboration, see www.cochrane.com) that assemble systematic reviews of studies of drugs and medical procedures. To provide a central information source for evidence, the Cochrane Collaboration publishes systematic reviews in successive issues of The Cochrane Database of Systematic Reviews. These reviews are continually revised both as new experimental results become available and as a result of valid criticisms of the reports. The Cochrane Collaboration actively solicits comments on its reports (subject to published house rules).
• Some countries have established central abstracting services for medical research papers. One such service provides references and abstracts from 4600 biomedical journals.
• To reduce the problem of “publication bias”, the Cochrane Collaboration provides a database for researchers to register that they are intending to perform a controlled trial. Publication bias is the phenomenon that more “positive” results are published than “negative” results. This can lead to an overestimation of the effect size in systematic reviews and an under-reporting of risks. The Cochrane Collaboration Groups use the register to follow up all trials whether or not they are published.
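The overestimation caused by publication bias can be illustrated with a small simulation. The sketch below is not taken from the paper: the true effect, the noise level, the publication threshold and the function names are all invented for illustration, and the rule that only estimates above a threshold are "published" is a deliberately crude assumption.

```python
import random
import statistics


def simulate_effects(true_effect, n_studies, noise_sd, seed=0):
    """Draw one noisy effect estimate per simulated study."""
    rng = random.Random(seed)
    return [rng.gauss(true_effect, noise_sd) for _ in range(n_studies)]


def published_subset(effects, threshold):
    """Keep only the 'positive' studies, i.e. estimates above the threshold."""
    return [e for e in effects if e > threshold]


# All values below are invented for illustration.
effects = simulate_effects(true_effect=0.2, n_studies=1000, noise_sd=0.5)
published = published_subset(effects, threshold=0.5)

# Mean over every trial, as a trial register would allow us to compute.
mean_all = statistics.mean(effects)
# Mean over the published trials only, as a naive review would compute.
mean_published = statistics.mean(published)

print(f"mean of all studies:       {mean_all:.2f}")
print(f"mean of published studies: {mean_published:.2f}")
```

Under these assumptions the mean of the published subset is necessarily larger than the mean over all studies, which is the kind of gap that following up every registered trial is intended to expose.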
Although we have no equivalent to the Cochrane Collaboration, there are many abstracting services that provide access to software engineering articles. Organizations such as the IEEE, with its database IEEE Xplore, and the ACM, with its Digital Library, provide access to databases of articles. These databases are indexed (for example, by keywords) and usually have links to abstracts and sometimes access to the original articles. Such indexing makes it easier to search for information regarding a problem area or find an answer to a specific question.
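As a rough illustration of how a question might be turned into a search string for such databases, the following sketch joins synonyms for the study factor and the population into a boolean query. The `Question` structure, the function names and the example terms are hypothetical, and the `AND`/`OR` syntax is an assumption; real digital libraries each have their own query syntax.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Question:
    """A question split into study factor and population terms (hypothetical)."""
    study_factor: List[str]
    population: List[str] = field(default_factory=list)


def _any_of(terms):
    # Quote multi-word terms so they are searched as phrases.
    quoted = [f'"{t}"' if " " in t else t for t in terms]
    return "(" + " OR ".join(quoted) + ")"


def build_search_string(question):
    """Join synonyms with OR inside each part, and the parts with AND."""
    parts = [_any_of(question.study_factor)]
    if question.population:
        parts.append(_any_of(question.population))
    return " AND ".join(parts)


# Example terms are illustrative only.
q = Question(
    study_factor=["statistically-derived estimation model", "estimation model"],
    population=["software project"],
)
print(build_search_string(q))
```

Listing several synonyms per part keeps the query broad enough to find most relevant studies, while the `AND` between parts narrows it to the population of interest.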
Critically appraising the evidence
The work of the Cochrane Collaboration and other national medical organisations (e.g. the Australian National Health and Medical Research Council) has radically changed the nature of medical research. Medical research has recognised that single studies (even the most rigorous double-blinded randomised controlled trials) are not sufficient on their own. The emphasis now is on the accumulation of evidence from many independent experiments.
Critical appraisal in EBM has been supported by improved methodology both for systematic reviews and individual studies:
• Several organisations have produced guidelines for systematic reviews and evaluating evidence. The Australian National Health and Medical Research Council publishes a series of more general handbooks that consider experimental methods other than just RCTs (see www.health.gov.au/mrc). Importantly, the Australian NHMRC makes a distinction between collating experimental evidence and packaging the evidence into tailored guidelines for various stakeholders.
• Medical journals have pressed for improvements in the conduct and reporting of individual experiments. A particular example is the CONSORT statement, which defines the standards for reporting randomised controlled trials (RCTs). This statement has been adopted as the standard for reporting RCTs by all the most important medical journals.
Currently, the evidence available about software engineering technologies is:
• Fragmented and limited. Many individual research groups undertake valuable empirical studies. However, because the goal of such work is individual publications and/or post-graduate theses, there is sometimes little sense of overall purpose to such studies. It is easy for researchers to undertake research in their own areas of interest rather than contribute to a wider research agenda.
There are no agreed standards for systematic reviews. Although most PhD students undertake reviews of the “State of the Art” in their topic of interest, the quality of such reviews is variable, and they do not as a rule lead to published papers. There is little appreciation of the value of systematic reviews; for example, there is only one Computing journal that solicits reviews (ACM Surveys). Furthermore, if we consider “meta-analysis”, which is a more statistically rigorous form of systematic review, there have been few attempts to apply meta-analytic techniques to software engineering, not least because of the limited number of replications. In general there are few incentives to undertake replication studies in spite of their importance in terms of the scientific method.
There are no generally accepted guidelines or standard protocols for conducting individual experiments. The recent dispute between Berry and Tichy and Sobel and Clarkson over the conduct of an experiment into formal methods makes it clear that empirical software engineering is badly in need of guidelines and protocols. Kitchenham et al. have proposed guidelines, but these do not address all types of study (for example, observational studies). Furthermore, because they attempt to address several different types of empirical study, the guidelines are not as specific, nor as detailed, as the CONSORT statement.
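To make the idea of meta-analysis mentioned above more concrete, the sketch below pools effect estimates from several replications using inverse-variance weighting under a fixed-effect model. This is one common technique chosen here for illustration; the paper does not prescribe a particular method, and the function name and input numbers are invented.

```python
import math


def fixed_effect_pool(estimates, std_errors):
    """Pool study estimates, weighting each by 1 / (standard error squared)."""
    if len(estimates) != len(std_errors) or not estimates:
        raise ValueError("need one standard error per estimate")
    if any(se <= 0 for se in std_errors):
        raise ValueError("standard errors must be positive")
    weights = [1.0 / se ** 2 for se in std_errors]
    total = sum(weights)
    pooled = sum(w * e for w, e in zip(weights, estimates)) / total
    pooled_se = math.sqrt(1.0 / total)
    return pooled, pooled_se


# Three hypothetical replications of the same experiment.
pooled, pooled_se = fixed_effect_pool([0.30, 0.10, 0.25], [0.10, 0.20, 0.15])
print(f"pooled effect = {pooled:.3f}, standard error = {pooled_se:.3f}")
```

The pooled standard error is smaller than that of any single study, which is why a shortage of replications limits what meta-analysis can contribute in software engineering.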
In medicine, a doctor is expected to relate evidence to the needs of the specific patient. For example, the advice given to a patient with a particular disease may differ according to his/her age and gender and the severity of the symptoms he/she displays. Although there are opportunities for individual software engineers and managers to adopt EBSE principles, the decision to adopt a technology is often an organisational issue that is influenced by factors such as the organizational culture, the experience and skill of the individual software developers, and the extent of training required. Using EBSE in practice may therefore be more demanding than EBM because the decision-making structure and adoption process are often more complex. We believe that EBSE would work well in an organization that has a strong commitment to process improvement, based on the recommendations in [...].
Currently this does not appear to be happening.
Research results are:
• Not in widespread use in industry. In our opinion, researchers often address issues that are not perceived to be of relevance to industry or present their results in a way that is virtually incomprehensible to decision makers in industry.
• Not of perceived value to stakeholders. Stakeholders such as consumer groups should all be concerned about the quality of the techniques used to build software products. The confidence of users of software intensive products would be substantially undermined if they were aware that the choice of development techniques is based on fashion and hype rather than scientific evidence.
This would involve propagating successful technologies throughout a company and preventing the spread of technologies that are unsuccessful. This concept fits well with the goals of software process improvement. However, there is a broader level of feedback in medicine. Individual doctors are responsible for reporting unanticipated side-effects of drugs. This contributes to the evidence associated with a particular treatment and may lead to further basic research. In software engineering, by contrast, there is little incentive for individual companies to assist their competitors by reporting good and bad experiences with new technologies. It is also difficult for experiences with a technology to be disentangled from the particular context in which it was used.
Implications for EBSE
It is clear that a full-scale implementation of EBSE is an extremely ambitious goal. It cannot be achieved without extensive collaboration and long-term commitment among individual research groups worldwide, and active support from other stakeholders such as practitioners in industry. Furthermore, it cannot be achieved without initial financial support from research funding agencies to enable the basic technological and methodological infrastructure to be established. Nevertheless, individual practitioners and researchers can use some of the ideas of EBSE without extensive technical support. Sackett et al. suggest that the support infrastructure is a major reason for the widespread adoption of the evidence-based paradigm in medicine.
In principle, technological and organisational support for evidence-based software engineering could be put in place. However, such support would be of little value if there were fundamental differences between medicine and software engineering that would make evidence-based software engineering difficult or impossible.
The skill factor
One area where there is a major difference between medicine and software engineering is that most software engineering methods and techniques must be performed by skilled software practitioners. By contrast, although medical practitioners are skilled individuals, the treatments they prescribe (e.g. medicines and other therapeutic remedies) do not usually require skill to administer or to receive. It is noticeable that evidence-based surgery, which is far more analogous to software engineering than other types of medical practice, is far less advanced than evidence-based medicine.
Skill presents a problem because it prevents adequate blinding. In medicine, the gold standard experiment is a double-blind randomised controlled trial (RCT). In a double-blind experimental trial neither the doctor nor the patient knows which treatment the patient is receiving. Double-blinded trials are required to prevent patients' and doctors' expectations biasing the results. Such experimental protocols are impossible in software engineering experiments that rely on a subject performing a human-intensive task.
We can accept that our experiments are bound to be less rigorous than medical trials and attempt to qualify our experiments appropriately.
Although we cannot usually blind experimenters or subjects, we can use blinding in a number of ways to reduce the opportunity for bias by reducing the direct interaction between subjects and experimenters during the course of an experiment:
• Blind allocation to treatment groups. Computerised methods can be used to automate the random allocation of subjects to each technique.
• Blind distribution of material. Linked to blind allocation, computers can be used to distribute experimental materials to subjects.
• Blind or automated marking. If the task results cannot be linked directly to the treatment (e.g. where the subjects are asked to answer some questions that test their understanding of a software document), the marker(s) should be blind to which treatment was used by the subjects (i.e. neither the subject nor the treatment should be identified to the marker).
Computerised systems can also be used when information is required from subjects. Subjects will usually be more accurate in reporting information to a computer system, particularly if they are guaranteed anonymity.
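The blind allocation and blind marking ideas above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not a protocol from the paper: the function names, the subject identifiers, the treatment labels and the code format `R1234` are all hypothetical.

```python
import random


def allocate(subject_ids, treatments, seed=None):
    """Randomly allocate subjects to treatments in near-equal group sizes."""
    rng = random.Random(seed)
    shuffled = list(subject_ids)
    rng.shuffle(shuffled)
    # Deal the shuffled subjects out to the treatments in turn.
    return {s: treatments[i % len(treatments)] for i, s in enumerate(shuffled)}


def blind_codes(subject_ids, seed=None):
    """Give each subject a unique opaque code so markers see neither name nor treatment."""
    rng = random.Random(seed)
    numbers = rng.sample(range(1000, 10000), len(subject_ids))
    return {s: f"R{n}" for s, n in zip(subject_ids, numbers)}


subjects = [f"subject-{i}" for i in range(1, 9)]
allocation = allocate(subjects, ["technique A", "technique B"], seed=42)
codes = blind_codes(subjects, seed=7)

# The marker receives only the opaque codes; the key stays with the system.
for s in subjects:
    print(codes[s])
```

Because the allocation and the code key are produced and held by the program, the experimenter need not know who received which technique until the marking is complete.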
We need to ensure that experimental designs allow for systematic subject differences due to skill, for example by using cross-over designs (where each subject acts as their own control). We should also encourage replication studies by experimenters who have no vested interest in the outcome of the study. However, we need to make sure replications are not too similar. It is important to vary experimental designs and experimental materials to avoid the risk of any common cause bias in replications.
In medicine, it is recognised that it is sometimes impossible to perform randomised trials and evidence from other types of experiment may need to be considered. The Australian National Health and Medical Research Council has published guidelines for evaluating the quality of evidence. One criterion has three elements, including Level and Statistical Precision. Level relates to the choice of study design and is used as an indicator of the degree to which bias has been eliminated by design. Statistical Precision refers to the P-value or the confidence interval. The guidelines also consider the distance of the estimated treatment effect from the null value and the inclusion of clinically important effects in the confidence interval, as well as the usefulness of the evidence in clinical practice, particularly the appropriateness of the outcome measures used.
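As a rough illustration of statistical precision and of the distance of an estimated effect from the null value, the sketch below computes an approximate 95% confidence interval for a difference in means. It assumes a normal approximation (z = 1.96), which is a simplification for small samples; the function name and the data are invented and do not come from the guidelines.

```python
import math
import statistics


def difference_ci(treated, control, z=1.96):
    """Return the mean difference and an approximate confidence interval."""
    diff = statistics.mean(treated) - statistics.mean(control)
    # Standard error of the difference, using sample variances.
    se = math.sqrt(
        statistics.variance(treated) / len(treated)
        + statistics.variance(control) / len(control)
    )
    return diff, (diff - z * se, diff + z * se)


# Invented outcome scores for two groups of subjects.
diff, (low, high) = difference_ci([20, 21, 22, 23], [10, 11, 12, 13])
excludes_null = low > 0 or high < 0  # the null value here is a difference of 0
print(f"difference = {diff:.1f}, interval = ({low:.2f}, {high:.2f})")
print("interval excludes the null value:", excludes_null)
```

A narrow interval far from zero corresponds to high precision and a large effect; an interval that straddles zero means the data are compatible with no effect at all.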
The lifecycle issue
The other major difference between software engineering and medicine is that most software engineering techniques impact a part of the lifecycle in a way that makes the individual effect of a technique difficult to isolate:
• They interact with many other development techniques and procedures. For example, a design method depends on a preceding requirements analysis. It must consider constraints imposed by the software and hardware platform and programming languages. It would therefore be difficult to confirm that a design technique had a significant impact on final product reliability. In general, it is difficult to determine a causal link between a particular technique and a desired project outcome when the application of the technique and the final outcome are temporally removed from one another, and there are many other tasks and activities that could also affect the final outcome.
• The immediate outcomes of a software engineering technique will not necessarily have a strong relationship with final project outcomes. For example, if you are interested in the effect design techniques have on application reliability (i.e. probability of failure in a given time period under defined operational conditions), measures of the design product (or design process) have no obvious relationship with the desired outcome. There are no good surrogate measures of product reliability that can be measured at the end of the design process.
There seem to be two major approaches to this issue: 1.
We can experiment with individual techniques isolated from o