CERN School of Computing 2003

Scientific programme

The School is based on approximately 29 lectures and on 24 hours of related practical exercises on PCs or workstations.

The programme of the School is organised around three themes (tracks):

·         Algorithms

The objective of this track is to make the students familiar with state-of-the-art algorithms used for event selection and event reconstruction. At the LHC, event selection is a very important topic because of the high event rate and the small signal-to-background ratio. An efficient and powerful selection procedure has to rely on an at least partial reconstruction of the event; reconstruction algorithms therefore already play an important role at this stage. The first lecture series explains the principles of event selection and gives an overview of the algorithms used in this context. The details of some of these algorithms are explored in the exercises.

If an event is selected, full reconstruction is launched. In this stage the emphasis is on the best possible precision. It has recently been shown that adaptive algorithms of track reconstruction can cope very well with the high background present in the LHC track detectors. The second lecture series explains the concept of adaptive methods, explores their relation to general combinatorial optimization problems, and shows that they can be implemented as iterated classical (least-squares) estimators. The most important features of adaptive methods are demonstrated in the exercises by solving some basic problems in combinatorial optimization.

The third lecture series deals with the task of vertex reconstruction, which consists of assigning tracks to common production points (vertices) and of finding optimal estimates of vertex and track parameters. After an exposition of the traditional methods, applications in several experiments are presented and discussed critically. Finally, it is shown that robust and adaptive methods are also important in vertex reconstruction. The adaptive method is extended to deal with several vertices concurrently, allowing dynamic swapping of tracks between vertices.

·         Grid Technologies

The Grid track covers several aspects of Grid computing and provides hands-on experience with modern Grid tools. A major part will be dedicated to Grid software produced within the EU DataGrid project (EDG). A special highlight is the presence of one of the people who created the Grid concept. EDG aims to develop a large-scale research testbed for Grid computing. The project is in its final phase, and a testbed spanning some ten major sites all over Europe has been up and running continuously since the beginning of 2002. Three application domains are using this testbed to explore the potential of Grid computing for their production environments: particle physics, Earth observation and biomedicine. The tutorials will present the EDG software architecture and discuss the interplay of basic Grid software (Globus, Condor), higher-level EDG middleware, and application software on the EDG testbed. Emphasis will be put on specific middleware issues in job submission and data management, as well as on EDG's security architecture. In several exercises students will learn how to use Grid tools for their distributed data- or computing-intensive applications.

·         Software Technologies

This track presents modern techniques for software design and modern tools and technologies for understanding and improving existing software. The emphasis will be placed on the large software projects and large executables that are common in HEP. The track will consist of lectures, exercises and discussions. The first discussion session will occur after several hours of exercises have been completed. The last discussion session will be held at the end of the track.
The first two lectures cover a series of tools and techniques that will be exemplified during the initial series of exercises. These will be followed later in the week by three lectures on software engineering, design, methodology and testing. A second series of three lectures will address the issues that arise when working with large software systems, including methods for analyzing their structure and improving it. The final three lectures will focus on a number of technologies that are relevant and commonly used for building interactive and distributed applications.

In the exercise sessions, the students will have a chance to use the tools described in the lectures. They will work with CVS and configuration management tools, and will be asked to use the test and debugging tools on some simple examples. By seeing how these tools locate known problems, students will learn how to apply them to new ones. Students will then be given a functional program and a brief description of what it does; the goal is to extend the program to handle a larger problem domain. It is expected that the example programs and exercises will be primarily in C++. In the final exercise sessions, the students will practice the newly introduced technologies on examples. For these exercises the Python language will be used.
 

 

The following lectures are foreseen:

Algorithms

AL1  Algorithms for online event selection at the LHC

This lecture series will cover the concept of online event selection ("triggering") at the LHC. One of the prime challenges facing the LHC experiments is to select the very few interesting events that may hint at new physics, or permit the measurement of fundamental parameters, amidst the much more copious occurrence of well-understood and well-studied processes. Out of a billion interactions per second, no more than a hundred can be stored for further analysis. After discussing the design and implementation of the online event selection strategies of the LHC experiments, we will describe reconstruction algorithms based on the concept of regional/partial reconstruction and their application to event selection based on physics objects (jets, muons, electrons, etc.). Special emphasis will be given to tracking algorithms such as the Kalman filter.
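
As a taste of the material, the sketch below shows the two steps of a Kalman filter, prediction and update, for a one-dimensional straight-line track crossing equally spaced detector layers. The simplified track model and all numbers are invented for illustration; they are not taken from any LHC experiment.

    def kalman_track_fit(hits, dz=10.0, sigma_hit=0.1):
        # Initial state (position, slope) with a large covariance,
        # i.e. "we know nothing yet".
        x = [hits[0], 0.0]
        C = [[1e6, 0.0], [0.0, 1e6]]
        V = sigma_hit ** 2                    # measurement variance

        for m in hits[1:]:
            # Prediction: propagate state and covariance to the next layer.
            x = [x[0] + dz * x[1], x[1]]
            c00 = C[0][0] + 2*dz*C[0][1] + dz*dz*C[1][1]
            c01 = C[0][1] + dz*C[1][1]
            C = [[c00, c01], [c01, C[1][1]]]

            # Update: blend the prediction with the measured hit position.
            R = C[0][0] + V                   # variance of the residual
            K = [C[0][0] / R, C[0][1] / R]    # Kalman gain
            r = m - x[0]                      # residual
            x = [x[0] + K[0]*r, x[1] + K[1]*r]
            C = [[(1 - K[0])*C[0][0], (1 - K[0])*C[0][1]],
                 [C[0][1] - K[1]*C[0][0], C[1][1] - K[1]*C[0][1]]]
        return x, C

    # Hits from a track with slope 0.05, smeared by the detector resolution.
    hits = [0.02, 0.48, 1.03, 1.51, 1.98]
    state, cov = kalman_track_fit(hits)
    print("fitted (position, slope) at the last layer: %s" % state)

Each new hit refines both the position and the slope estimate, which is why the filter can follow a track layer by layer instead of fitting all hits at once.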

Lectures:

AL-ES-L-1   Introduction to online event selection at the LHC                     N. Neumeister/T. Todorov

AL-ES-L-2  Regional and partial event reconstruction                                   N. Neumeister/T. Todorov

AL-ES-L-3  Reconstruction of physics objects                                              N. Neumeister/T. Todorov

AL-ES-L-4  Algorithms for track reconstruction                                           N. Neumeister/T. Todorov

Exercises:

 

AL-ES-E     The use of the algorithms discussed in the lectures will be demonstrated with real-world examples.        N. Neumeister/T. Todorov

 

AL2    Adaptive methods with application to track reconstruction at LHC

 

The task of reconstructing particle tracks in a noisy environment leads to a combinatorial optimization problem that cannot be solved exhaustively. The series will give an introduction to adaptive methods for solving such problems, which also occur in other fields such as cluster analysis, image processing and operations research. In the context of track reconstruction the adaptive solution can be formulated as an iteratively reweighted least-squares procedure (Kalman filter) with annealing. Examples of the performance of this algorithm on reconstruction problems in the ATLAS Inner Detector and the CMS Tracker will conclude the lecture series.
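
The following sketch illustrates the idea on the simplest possible case: an iteratively reweighted least-squares fit of a straight line, with annealing, in the presence of an outlier. The weight function, cut-off value and annealing schedule are plausible choices made up for the example, not the precise formulation used in the lectures.

    import math

    def weighted_line_fit(xs, ys, ws):
        # Closed-form weighted least-squares fit of y = a + b*x.
        W   = sum(ws)
        Sx  = sum([w*x for w, x in zip(ws, xs)])
        Sy  = sum([w*y for w, y in zip(ws, ys)])
        Sxx = sum([w*x*x for w, x in zip(ws, xs)])
        Sxy = sum([w*x*y for w, x, y in zip(ws, xs, ys)])
        b = (W*Sxy - Sx*Sy) / (W*Sxx - Sx*Sx)
        a = (Sy - b*Sx) / W
        return a, b

    def adaptive_fit(xs, ys, sigma=0.1, chi2_cut=9.0,
                     temperatures=(64.0, 16.0, 4.0, 1.0)):
        ws = [1.0] * len(xs)            # start with every point fully "in"
        for T in temperatures:          # annealing: lower T step by step
            a, b = weighted_line_fit(xs, ys, ws)
            ws = []
            for x, y in zip(xs, ys):
                # Soft assignment weight from the chi2 of the residual;
                # it approaches a hard 0/1 cut as T goes to zero.
                chi2 = ((y - a - b*x) / sigma) ** 2
                num = math.exp(-0.5 * chi2 / T)
                ws.append(num / (num + math.exp(-0.5 * chi2_cut / T)))
        return a, b, ws

    xs = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
    ys = [0.02, 1.05, 1.98, 3.01, 9.00, 5.02]   # the point at x=4 is an outlier
    a, b, ws = adaptive_fit(xs, ys)
    print("intercept %.3f, slope %.3f" % (a, b))
    print("final weights: %s" % ["%.2f" % w for w in ws])

Starting at high temperature keeps all assignments soft, so the fit cannot lock on to a wrong hypothesis too early; as T decreases, the outlier's weight falls towards zero and the good points recover full weight.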

 

AL-AM-L-1  Introduction to adaptive methods.                                                A. Strandlie

AL-AM-L-2 Adaptive methods in track reconstruction at LHC.                        A. Strandlie

Exercises:

AL-AM-E-1 and 2   The most important concepts presented in the lecture series will be exemplified by basic but nevertheless representative problems. The solutions will show the benefit of the adaptive approach even in these simple cases.        A. Strandlie

 

AL3 Vertex reconstruction

 

Vertex reconstruction follows track reconstruction and determines the interaction points at which the tracks have been produced. In the first part of the lecture series the basic algorithms are reviewed, and examples in different experimental settings are presented. The second part deals with methods of robustification, including an adaptive method closely related to the one used in track reconstruction. Finally, it is shown how the adaptive method can be generalized in order to reconstruct several vertices concurrently, allowing swapping of tracks between them.
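
The sketch below illustrates the concurrent, multi-vertex variant in one dimension: each track is reduced to a measured position along the beam line, and two vertex candidates compete for the tracks. Because the assignment weights are normalised across vertices, a track can migrate from one vertex to the other while the annealing temperature is high. All numbers and the specific weight function are invented for illustration.

    import math

    def adaptive_vertices(zs, seeds, sigma=0.05,
                          temperatures=(16.0, 4.0, 1.0, 0.25)):
        vertices = list(seeds)
        for T in temperatures:
            # Soft assignment: normalised weight of track i for vertex k.
            weights = []
            for z in zs:
                ws = [math.exp(-0.5 * ((z - v) / sigma) ** 2 / T)
                      for v in vertices]
                total = sum(ws)
                weights.append([w / total for w in ws])
            # Re-estimate each vertex as the weighted mean of its tracks.
            for k in range(len(vertices)):
                num = sum([weights[i][k] * zs[i] for i in range(len(zs))])
                den = sum([weights[i][k] for i in range(len(zs))])
                vertices[k] = num / den
        return vertices, weights

    # Track positions along the beam line from two interactions,
    # at z = 0.0 and z = 0.5 (arbitrary units).
    zs = [0.01, -0.03, 0.04, 0.02, 0.52, 0.47, 0.55, 0.49]
    vertices, weights = adaptive_vertices(zs, seeds=[-0.2, 0.8])
    print("fitted vertex positions: %s" % ["%.3f" % v for v in vertices])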

 

AL-VR-L-1  Basic algorithms for vertex reconstruction, experimental applications        M. Regler

 

AL-VR-L-2  Robust and adaptive algorithms for vertex reconstruction        M. Regler

 

Grid Technologies

Lectures:

Grid Technologies                                                                                         

The Grid Technologies series of lectures covers several aspects of Grid computing and provides hands-on experience with modern Grid tools. A major part will be dedicated to Grid software produced within the EU DataGrid project (EDG). EDG aims to develop a large-scale research testbed for Grid computing. The project is in its final phase, and a testbed spanning some ten major sites all over Europe has been up and running continuously since the beginning of 2002. The tutorials will present the EDG software architecture and discuss the interplay of basic Grid software (Globus, Condor), higher-level EDG middleware, and application software on the EDG testbed. Emphasis will be put on specific middleware issues in job submission and data management, as well as on EDG's security architecture. In several exercises students will learn how to use Grid tools for their distributed data- or computing-intensive applications. In addition, two optional hours of exercises will be offered to students, focusing on network performance aspects of the Grid.
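
As a foretaste of the job-submission exercises, here is a minimal job description in the JDL (Job Description Language) understood by the EDG workload management system. The attribute names follow the EDG user documentation, but the exact attribute set and the name of the submission command (edg-job-submit in recent releases) vary between middleware versions, so treat the details as illustrative.

    Executable    = "/bin/echo";
    Arguments     = "Hello from the Grid";
    StdOutput     = "stdout.log";
    StdError      = "stderr.log";
    OutputSandbox = {"stdout.log", "stderr.log"};

The resource broker matches such a job to a computing element published in the information system; optional Requirements and Rank attributes steer that choice.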

GT-GT-L-1     Lecture 1                                                                             E. Laure, H. Stockinger, K. Stockinger
GT-GT-L-2     Lecture 2
GT-GT-L-3     Lecture 3
GT-GT-L-4     Lecture 4
GT-GT-L-5     Lecture 5
GT-GT-L-6     Lecture 6
 

The Open Grid Services Architecture and the Globus Toolkit: Status and Future Directions

The Open Grid Services Architecture (OGSA) is a standards-based foundation for building service-based Grid infrastructure and applications. OGSA provides a uniform foundation on which robust and scalable Grid services can be constructed. Because of these advantages, the newest version of the Globus Toolkit is based on OGSA, making GT3 an open source reference implementation of the Open Grid Services Infrastructure (OGSI) core behaviors as well as of fundamental Grid functionality. In these two talks, I will give an overview of OGSA and describe its implementation in GT3. I will also outline current directions in both OGSA and the development of the Globus Toolkit.

GT-FG-L-1         Lecture 1                                                                         C. Kesselman
GT-FG-L-2         Lecture 2

 

Networking QoS Basics (optional) (2 hrs)                                                 F. Flückiger

Exercises:

GT-GT-E-1     Exercise on Grid Technologies                                            E. Laure, H. Stockinger, K. Stockinger
GT-GT-E-2     Exercise 2
GT-GT-E-3     Exercise 3
GT-GT-E-4     Exercise 4
GT-GT-E-5     Exercise 5
GT-GT-E-6     Exercise 6
                                                                               

Mini-Project (2 hrs)                                                                                    E. Laure, H. Stockinger, K. Stockinger

Wrap-Up (2 hrs)                                                                                          E. Laure, H. Stockinger, K. Stockinger

Exercises:

GT-NP-E-1     Exercise on Network Performance                                     E. Laure, H. Stockinger, K. Stockinger
GT-NP-E-2     Exercise on Network Performance
                                            

Software Technologies

Lectures:

Tools and Techniques                                                      

These lectures present tools and techniques that are valuable when developing software for high energy physics.  We discuss how to work more efficiently while still creating a high quality product that your colleagues will be happy with. The exercises provide practice with each of the tools and techniques presented, and culminate in a small project.

ST-TT-L-1   Tools                                                                R.G. Jacobsen
ST-TT-L-2   Techniques                                                           R.G. Jacobsen
 

Software Engineering

An introduction to the principles of Software Engineering, with emphasis on what we know about building large software systems for high-energy physics. These lectures cover the principles of software engineering, design, methodology and testing.

ST-SE-L-1     Introduction to Software Engineering                       R.G. Jacobsen
ST-SE-L-2     Software Design                                                         R.G. Jacobsen
ST-SE-L-3     Long-term Issues of Software Building                     R.G. Jacobsen

                                                          

ST-AS-L-1    Object oriented programming with Java and C++          P. Tonella

ST-AS-L-2    Refactoring                                                                   P. Tonella

ST-AS-L-3    C++ code analysis and verification of coding conventions          P. Tonella

 

ST-IC-L-1     Python scripting language                                               A. Pfeiffer
The first session will give an introduction to Python, an object-oriented interpreted language. After an introduction to the syntax of Python, including data types, functions and control structures (such as if, for and try), loadable modules will be presented. An introduction to basic file I/O and object persistency in Python will be followed by an introduction to classes, inheritance and other object-oriented aspects of Python. The session will finish with an overview of ways to extend Python through extensions written in other languages such as C and C++.
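
As a flavour of the session, the short sketch below touches several of these topics: classes, control structures, exception handling and file I/O. The class, function and file names are invented for the example.

    class Histogram:
        "A trivial fixed-bin histogram, illustrating classes and methods."
        def __init__(self, nbins, lo, hi):
            self.nbins, self.lo, self.hi = nbins, lo, hi
            self.bins = [0] * nbins

        def fill(self, x):
            if self.lo <= x < self.hi:          # a control structure
                width = (self.hi - self.lo) / float(self.nbins)
                self.bins[int((x - self.lo) / width)] += 1

    def read_values(filename):
        "File I/O with exception handling (try/except)."
        values = []
        try:
            f = open(filename)
            for line in f.readlines():          # loop over the lines
                values.append(float(line))
            f.close()
        except IOError:
            print("could not read %s" % filename)
        return values

    h = Histogram(10, 0.0, 1.0)
    for v in [0.1, 0.15, 0.7, 0.72, 0.74, 0.9]:
        h.fill(v)
    print("bin contents: %s" % h.bins)
    values = read_values("measurements.txt")    # invented file name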


ST-IC-L-2     XML and related technologies (XSLT, ...)                        A. Pfeiffer
The second session will introduce XML, the eXtensible Markup Language, as a standard for document markup. XML defines a generic syntax to mark up data with simple, human-readable tags, thus providing a flexible standard format for a variety of application domains. The session will introduce the fundamental entities of XML, ranging from elements and attributes, through namespaces, to Document Type Definitions (DTDs) and the definition of a valid XML document. Building on this, the session will then look at styling XML documents using the eXtensible Stylesheet Language (XSL) in its forms XSLT and XSL-FO. Extensions to XML such as XPath, XLink and XPointer will be discussed in the context of data-centric documents. The session will conclude with an overview of the DOM and SAX models of representing the structure of an XML document.
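
For a first impression, the sketch below parses a small, invented XML document with the DOM model, using the xml.dom.minidom module from the Python standard library, and walks over its elements and attributes.

    from xml.dom.minidom import parseString

    XML = ('<run number="1042">'
           '<event id="1"><track pt="12.5"/><track pt="3.1"/></event>'
           '<event id="2"><track pt="45.0"/></event>'
           '</run>')
    document = parseString(XML)

    run = document.documentElement            # the root element
    print("run number: %s" % run.getAttribute("number"))
    for event in run.getElementsByTagName("event"):
        pts = [t.getAttribute("pt")
               for t in event.getElementsByTagName("track")]
        print("event %s: track pt values %s" % (event.getAttribute("id"), pts))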


ST-IC-L-3    Distributed computing technologies and protocols (SOAP, XML-RPC, Web services, ...)            A. Pfeiffer
Web services provide a standard means of communication among different software applications running on a variety of platforms and/or frameworks. We will cover the World Wide Web Consortium (W3C) definitions of Web services and give an overview of their architecture. We show how XML is used in the context of Web services and discuss two major protocols for interchanging data between applications and Web services: XML-RPC and SOAP. The various roles that software agents can play in the basic architecture (service requestor, service provider and discovery agency) are discussed, as well as the Web Service Description Language (WSDL), a model and XML format for describing Web services. WSDL enables one to separate the description of the abstract functionality offered by a service from concrete details such as "how" and "where" that functionality is provided.
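
A minimal illustration of the idea, using the XML-RPC support in the Python standard library (module names as in the Python 2 series, current at the time of writing; later versions renamed them xmlrpc.server and xmlrpc.client). The function, host and port are invented for the example.

    # server.py -- registers one function and serves it over HTTP.
    from SimpleXMLRPCServer import SimpleXMLRPCServer

    def mass_squared(e, px, py, pz):
        # The remote procedure: plain numbers in, a plain number out;
        # the call and the reply travel as XML over HTTP.
        return e*e - px*px - py*py - pz*pz

    server = SimpleXMLRPCServer(("localhost", 8000))
    server.register_function(mass_squared)
    server.serve_forever()

    # client.py -- calls the remote function as if it were local:
    #
    #   import xmlrpclib
    #   proxy = xmlrpclib.ServerProxy("http://localhost:8000/")
    #   print proxy.mass_squared(10.0, 1.0, 2.0, 3.0)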

Exercises:

ST-TT-E-1     Exercises on Tools and Techniques                              R.G. Jacobsen
ST-TT-E-2     Exercises on Tools and Techniques
ST-TT-E-3     Exercises on Tools and Techniques
ST-TT-E-4     Exercises on Tools and Techniques
                                                              

ST-AS-E-1     Exercises on System Analysis                                      P. Tonella
ST-AS-E-2     Exercises on System Analysis
                                                                       

ST-IC-E-1      Exercises on Technologies for Interactive & Distributed Computing     A. Pfeiffer
ST-IC-E-2      Exercises on Technologies for Interactive & Distributed Computing
         

Lecturers

F. Flückiger, CERN, Geneva, Switzerland

François Flückiger, Director of the CERN School of Computing, is Technology Transfer Officer for Information Technologies at CERN and Associate Head of the CERN openlab for DataGrid applications. He is also an adjunct professor of Computer Science at the University of Geneva. Before joining CERN in 1978, he was employed for five years by SESA in Paris. At CERN, he has been in charge of external networking for more than 12 years and has held positions in infrastructure and application networking. He is an adviser to the European Commission, a member of the Internet Society Advisory Council and the author of the reference textbook "Understanding Networked Multimedia" as well as more than 80 articles. He has 30 years of experience in networking and information technologies. François Flückiger graduated from the Ecole Supérieure d'Electricité in 1973 and obtained an MBA from the Enterprise Administration Institute in Paris in 1977.
 

R.G. (Bob) Jacobsen, University of California, Berkeley, USA

Bob Jacobsen is an experimental high-energy physicist and a faculty member at the University of California, Berkeley. He is a member of the BaBar collaboration, where he led the effort to create the reconstruction software and the offline system. He has previously been a member of the ALEPH (LEP) and MarkII (SLC) collaborations. His original academic training was in computer engineering, and he worked in the computing industry before becoming a physicist.

R. (Bob) Jones, CERN, Geneva, Switzerland

After studying computer science at university, Bob joined CERN and has been working on online systems for the LEP and LHC experiments. Databases, communication systems, graphical user interfaces and the application of these technologies to data acquisition systems formed the basis of his thesis. He is currently responsible for the control and configuration sub-system of the ATLAS data acquisition prototype project.

Carl Kesselman, University of Southern California, Marina del Rey, USA

Dr. Carl Kesselman is a Senior Project Leader at the University of Southern California's Information Sciences Institute and a Research Associate Professor of Computer Science, also at the University of Southern California. Prior to joining USC, Dr. Kesselman was a Member of the Beckman Institute and a Senior Research Fellow at the California Institute of Technology. He holds a Ph.D. in Computer Science from the University of California at Los Angeles. Dr. Kesselman's research interests are in high-performance distributed computing, or Grid computing. He is the co-leader of the Globus project and, along with Dr. Ian Foster, edited a widely referenced text on Grid computing.

Erwin Laure, CERN, Geneva, Switzerland

Dr. Erwin Laure received his Ph.D. degree in Business Administration and Computer Science in 2001 from the University of Vienna. After working as a research assistant at the Institute for Software Science of the University of Vienna, he joined CERN in 2002 as a member of the EU DataGrid project, working on data management issues. Since November 2002 he has been the Deputy Technical Coordinator of the EU DataGrid project.

Norbert Neumeister, HEPHY, Vienna, Austria

Norbert Neumeister studied physics and computer science and received his Ph.D. in particle physics in 1996 from the Vienna University of Technology, Austria. He is a member of the CMS collaboration, where he is coordinating the development of muon reconstruction and selection algorithms within the Physics Reconstruction and Selection project. Before joining CMS he worked on the DELPHI experiment. As a research fellow at CERN he was involved in the design and implementation of the High Level Trigger system of the CMS experiment. At present he holds a position as a research scientist at the Institute for High Energy Physics of the Austrian Academy of Sciences.

Andreas  Pfeiffer, CERN, Geneva, Switzerland

Andreas Pfeiffer studied physics in Giessen and Heidelberg, where he received his Ph.D. in 1988. He then worked until 1998 at the University of Heidelberg on the CERES/NA45 experiment at CERN, studying the creation of e+e- pairs in ultra-relativistic heavy-ion collisions at the SPS. His main responsibilities were computing, both online/DAQ and offline/simulation. In January 1999 he joined CERN to lead the Anaphe (formerly LHC++) project in the IT division. With the start-up of the LHC Computing Grid (LCG) project, Andreas is now working in the PI project of LCG and on analysis-related issues in the CMS experiment.

Meinhard Regler, HEPHY, Vienna, Austria

Meinhard Regler studied physics in Vienna. After graduation he worked at the Institute of High Energy Physics in Vienna. In 1970 he moved to CERN as a staff member. In 1975 he left CERN and returned to the institute, where he was appointed leader of the experimental group. He has been a lecturer and, since 1989, titular professor at the University of Technology in Vienna. Since 1993 he has been Deputy Director and a member of the Executive Board of the institute. His main research interests are detector design, algorithms for data analysis, and the application of accelerators in medicine. He is the editor and co-author of a book on data analysis methods in high-energy physics.

Heinz Stockinger, CERN, Geneva, Switzerland

Dr. Heinz Stockinger holds a Fellow position at CERN, where he is working in the European DataGrid project (EDG). Within EDG he is the Education and Outreach Manager as well as responsible for replication software in the Data Management work package. Heinz holds a Ph.D. degree in Computer Science and Business Administration from the University of Vienna, Austria.

Kurt Stockinger, CERN, Geneva, Switzerland

Dr. Kurt Stockinger is a research fellow in the Database Group at CERN. He is leading the Optimisation Task of the EU DataGrid project funded by the European Commission. His research interests include performance evaluation of parallel and distributed systems, access optimisation of distributed database management systems and Data Grids, and the design and implementation of multi-dimensional data structures for large data warehouses. During recent research visits to the California Institute of Technology and the Lawrence Berkeley National Laboratory, Kurt assisted in improving the query performance of large-scale scientific analysis frameworks. He received a Ph.D. in Computer Science and Business Administration from the University of Vienna, Austria, under the supervision of the CERN Database Group.

Are Strandlie, CERN, Geneva, Switzerland

Dr. Are Strandlie received his Master of Science degree in Theoretical Physics in 1995 and his Doctor of Science degree in Experimental Particle Physics in 2000, both from the University of Oslo. He joined CERN as a Fellow in 2001, where he is working on software development for the CMS Tracker. Strandlie's research interests centre on track reconstruction problems, with a primary focus on adaptive methods.

Theodore Todorov, IReS, Strasbourg, France

Theodore (Teddy) Todorov is an experimental particle physicist working on and coordinating the development of the reconstruction software for the CMS inner tracker. He received his Ph.D. in physics in 1993 from the Université Louis Pasteur in Strasbourg.  Before joining CMS (in 1995) he worked in the DELPHI collaboration.

Paolo Tonella, Istituto di Cultura, Trento, Italy

Paolo Tonella received his laurea degree cum laude in Electronic Engineering from the University of Padua, Italy, in 1992, and his PhD degree in Software Engineering from the same university, in 1999, with the thesis "Code Analysis in Support to Software Maintenance". Since 1994 he has been a full-time researcher in the Software Engineering group at IRST (Institute for Scientific and Technological Research), Trento, Italy. He has participated in several industrial and European Community projects on software analysis and testing. He is now the technical person responsible for a project with the ALICE, ATLAS and LHCb experiments at CERN on the automatic verification of coding standards and on the extraction of high-level UML views from the code. In 2000-2001 he gave a course on Software Engineering at the University of Brescia. He now teaches Software Analysis and Testing at the University of Trento. His current research interests include reverse engineering, object-oriented programming, web applications and static code analysis.


Text: Jackie Franco-Turner
Web: Pietro Paolo Martucci
Last update: 4 June 2003