CSC2005 Software Technologies Track

Coordinators:

Patricia MacBride, FNAL
Pere Mato, CERN
 
 

 

This track presents modern techniques for software design, along with modern tools and technologies for understanding and improving existing software. The emphasis is on the large software projects and large executables that are common in HEP. The track consists of lectures and exercises. The first series of lectures covers tools and techniques that are then exemplified in the initial series of exercises. These lectures include topics such as software engineering, design, methodology and testing. The second series of lectures focuses on a number of technologies that are relevant to, and commonly used for, building interactive and distributed applications.

In addition to pure software design and development issues, the track is complemented by two special yet essential topics: methods and techniques for improving computer security, and Internet quality of service and network performance.

In the exercise sessions, the students will have a chance to use the tools that are described in the lectures. They will work with configuration management tools. They will be asked to use the test and debugging tools on some simple examples. By seeing how these tools can locate known problems, students will learn how to use them on new problems. Students will then be given a functional program and a brief description of what it does. The goal is to extend the program to handle a larger problem domain. In the final exercise sessions, the students will practice the newly introduced technologies with examples. For these exercises the Python language will be used.

Overview

Type

Series

Lecture

Description

Lecturer

     

 

 

Lectures

 

Tools and Techniques

Lecture 1

Introduction to the Track

To start, we discuss some of the characteristics of software projects for high energy physics, and some of the issues that arise when people want to contribute to them. This forms the framework for the Software Technologies Track. We then continue with a brief introduction to software engineering from the perspective of the individual contributor, both as a formal process and in terms of how it actually affects what you do.

Bob Jacobsen

Lecture 2

Tools You Can Use

This lecture discusses several categories of tools and techniques you can use to make yourself more productive and effective. Continuous testing and documentation have proven to be important in producing high quality work, but they are often difficult to do; we discuss some available approaches. Many problems require specific tools and techniques to solve them effectively: we discuss the examples of performance tuning and memory access problems.

Bob Jacobsen

Lecture 3

Tools for Collaboration

HEP software is built by huge teams. How can this be done effectively, while still giving people satisfying tasks to perform?
This lecture discusses some of the technical approaches used. Source control (e.g. CVS) is becoming common, so we just skim over its advantages and disadvantages to get to the larger area of release control (e.g. CMT) and release testing and distribution. We'll focus on why this is considered a hard problem, and what the current techniques for dealing with it are.

Bob Jacobsen

Lecture 4

Software Engineering Across the Project

Now that we've covered both individual and group work, we go back to the software engineering topics of the first lecture to see how these fit together. How does our individual work affect the ability of the entire project to proceed? What are the tools and techniques that will improve both our individual work and our contributions to the whole?
We close with a summary of observations.

Bob Jacobsen

Exercises

 

Tools and Techniques

Exercise 1
Exercise 2
Exercise 3
Exercise 4

Exercises 1 and 2

The first two exercises provide some direct experience with the tools and techniques described in Lectures 1 and 2. In particular, pairs of students will work together to update existing applications, working through examples designed to show the strengths and weaknesses of several approaches.

Exercises 3 and 4

After the two-person teams acquire some experience with the CMT release system, and CVS if needed, we will have groups of 5 teams work together to create a functional release from individual sub-projects at various stages of completion. Although a limited exercise, this is intended to demonstrate some of the real issues discussed in the lecture.

Bob Jacobsen

 

 

 

 

 

Lectures

 

Technologies for Interactive & Distributed Computing

Lecture 1

Python scripting language

The first session gives an introduction to Python, an object-oriented, interpreted language. After an introduction to the syntax of Python, including data types, functions and control structures (like if, for, try), loadable modules will be presented. An introduction to basic file I/O and object persistency in Python will be followed by an introduction to classes, inheritance and other object-oriented aspects of Python. The session will finish with an overview of ways to extend Python through extensions written in other languages such as C and C++.

Andreas Pfeiffer
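As a rough illustration of the topics listed for the Python lecture above (functions, control structures, classes, inheritance, file I/O and object persistency), a minimal sketch could look like the following. It is not taken from the lecture material: the class names, function names and particle values are hypothetical and chosen only for illustration, and `pickle` is assumed here as one possible persistency mechanism.

```python
import os
import pickle
import tempfile


class Particle:
    """A minimal class holding a name and a mass (illustrative values only)."""

    def __init__(self, name, mass_gev):
        self.name = name
        self.mass_gev = mass_gev

    def describe(self):
        return f"{self.name}: {self.mass_gev} GeV"


class ChargedParticle(Particle):
    """Inheritance: extends Particle with an integer electric charge."""

    def __init__(self, name, mass_gev, charge):
        super().__init__(name, mass_gev)
        self.charge = charge

    def describe(self):
        # Reuse the parent implementation and append the charge.
        return f"{super().describe()}, charge {self.charge:+d}"


def heaviest(particles):
    """Return the particle with the largest mass, or None for an empty list."""
    best = None
    for p in particles:                      # 'for' loop
        if best is None or p.mass_gev > best.mass_gev:   # 'if' test
            best = p
    return best


def save_and_reload(data):
    """Object persistency: pickle data to a temporary file and read it back."""
    fd, path = tempfile.mkstemp(suffix=".pkl")
    os.close(fd)
    try:                                     # 'try/finally' guarantees cleanup
        with open(path, "wb") as f:
            pickle.dump(data, f)
        with open(path, "rb") as f:
            return pickle.load(f)
    finally:
        os.remove(path)


particles = [Particle("photon", 0.0), ChargedParticle("muon", 0.1057, -1)]
masses = {p.name: p.mass_gev for p in particles}   # dictionary data type
restored = save_and_reload(masses)
```

Here `restored` should compare equal to `masses`, and `heaviest(particles)` returns the muon object.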

Lecture 2

XML and related technologies (XSLT, ...)

The second session will introduce XML, the eXtensible Markup Language, as a standard for document markup. XML defines a generic syntax to mark up data with simple, human-readable tags, thus providing a flexible standard format for a variety of application domains. The session will introduce the fundamental entities of XML, ranging from elements and attributes, through namespaces, to Document Type Definitions (DTDs) and the definition of a valid XML document. Based on this, the session will then look into the area of styling XML documents using the eXtensible Stylesheet Language (XSL) in its forms of XSLT and XSL-FO. Extensions to XML like XPath, XLink and XPointer will be discussed in the context of data-centric documents. The session will conclude with an overview of the DOM and SAX models of representing the structure of an XML document.

Andreas Pfeiffer
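To make the difference between the DOM and SAX models mentioned above more concrete, the sketch below reads the same small document both ways using the Python standard library (`xml.dom.minidom` and `xml.sax`). The document content, the namespace URI `http://example.org/detector` and the handler class name are hypothetical and serve only as an illustration.

```python
import xml.sax
from xml.dom import minidom

# A small document with elements, attributes and one namespace prefix.
DOC = """<?xml version="1.0"?>
<run xmlns:det="http://example.org/detector" number="42">
  <det:event id="1">first</det:event>
  <det:event id="2">second</det:event>
</run>"""

# DOM: the whole document is parsed into an in-memory tree that can be
# navigated and queried after parsing has finished.
dom = minidom.parseString(DOC)
run_number = dom.documentElement.getAttribute("number")
dom_ids = [e.getAttribute("id") for e in dom.getElementsByTagName("det:event")]


# SAX: the parser streams through the document and calls back into a
# handler for each start tag; no tree is built.
class EventIdCollector(xml.sax.ContentHandler):
    def __init__(self):
        super().__init__()
        self.ids = []

    def startElement(self, name, attrs):
        # With namespace processing left at its default (off), 'name'
        # is the qualified name as written in the document.
        if name == "det:event":
            self.ids.append(attrs.getValue("id"))


handler = EventIdCollector()
xml.sax.parseString(DOC.encode("utf-8"), handler)
sax_ids = handler.ids
```

Both approaches yield the same list of event ids. The DOM version keeps the full tree available for further queries, while the SAX version only retains what the handler chooses to store.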

Lecture 3

Distributed computing technologies and protocols (SOAP, XMLRPC, Web services, ...)

Web services provide a standard means of communication among different software applications running on a variety of platforms and/or frameworks. We will cover the World Wide Web Consortium (W3C) definitions of Web services and give an overview of their architecture. We show how XML is used in the context of Web services and discuss two major protocols for interchanging data between applications and Web services: XML-RPC and SOAP. The various roles that software agents can have in the basic architecture (Service requestor, Service provider and Discovery agency) are discussed, as well as the model and XML format for describing Web services, such as the Web Services Description Language (WSDL). WSDL enables one to separate the description of the abstract functionality offered by a service from the concrete details of a service description, such as the "how" and "where" of that functionality.

Andreas Pfeiffer
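As a small illustration of how XML carries a remote procedure call in XML-RPC, the sketch below uses the `dumps` and `loads` functions of Python's standard `xmlrpc.client` module to build and decode the messages without opening a network connection. The `add` function and the dispatch table are hypothetical stand-ins for a service provider; in a real deployment the XML would travel over HTTP.

```python
import xmlrpc.client

# Service requestor side: marshal a call to a method named "add" into XML.
request_xml = xmlrpc.client.dumps((2, 3), methodname="add")


# Service provider side (hypothetical): a table of exposed functions.
def add(a, b):
    return a + b


exposed = {"add": add}

# Decode the request, dispatch to the named function, marshal the response.
params, method = xmlrpc.client.loads(request_xml)
result = exposed[method](*params)
response_xml = xmlrpc.client.dumps((result,), methodresponse=True)

# Back on the requestor side: decode the response payload.
(answer,), _ = xmlrpc.client.loads(response_xml)
```

Printing `request_xml` shows the `<methodCall>` document, including the `<methodName>` element and one `<param>` per argument, which is the on-the-wire form that the two applications exchange.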

Exercises

 

Technologies for Interactive & Distributed Computing

Exercise 1
Exercise 2
Exercise 3

Exercises on Technologies for Interactive & Distributed Computing

Andreas Pfeiffer

 

 

 

 

 

Lectures

 

Computer Security

Lecture 1

An Introduction to Cryptography

Computer security relies on a number of complementary technologies. Cryptography is one of them. Contrary to what is sometimes believed, cryptography's role is not only to ensure the confidentiality of exchanges. It also serves to protect the integrity of transmitted information and, more importantly in Grid environments, to authenticate individuals and systems. The lecture describes the fundamentals of asymmetric encryption and explains how it is implemented in the real world.

Alberto Pace
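The idea behind asymmetric encryption described above can be sketched with textbook RSA on deliberately tiny numbers. This is a toy under stated assumptions, not the lecture's material and not a secure implementation: the primes, the exponent and the message are illustrative, there is no padding, and real systems use keys thousands of bits long through a vetted library. The modular inverse via `pow(e, -1, phi)` assumes Python 3.8 or later.

```python
# Key generation with two small primes (illustrative values only).
p, q = 61, 53
n = p * q                    # public modulus
phi = (p - 1) * (q - 1)      # Euler's totient of n
e = 17                       # public exponent, coprime with phi
d = pow(e, -1, phi)          # private exponent: e * d == 1 (mod phi)

public_key = (e, n)
private_key = (d, n)

message = 65                 # a message encoded as an integer smaller than n

# Confidentiality: anyone can encrypt with the public key,
# only the holder of the private key can decrypt.
cipher = pow(message, e, n)
decrypted = pow(cipher, d, n)

# Authentication and integrity: the private-key holder signs,
# anyone can verify with the public key.
signature = pow(message, d, n)
signature_ok = pow(signature, e, n) == message
tampered_ok = pow(signature, e, n) == message + 1
```

The same key pair serves both purposes: decrypting with the private key recovers the message, and a signature made with the private key checks out against the public key only for the unmodified message.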

Lecture 2

An Introduction to PKI

Cryptography alone is not sufficient to ensure that secret information is safely shared. In particular, distributing cryptographic keys requires an infrastructure of logically connected systems. This is called Public Key Infrastructure (PKI) and is the subject of this lecture.

Alberto Pace

Lecture 3

An Introduction to Kerberos

Kerberos is an alternative to PKI for authentication. This third lecture explains how the two technologies are positioned relative to each other and where they differ. It also explains how they can be integrated. This is illustrated by practical examples drawn from web and mail services.

Alberto Pace

 

 

 

 

 

Lectures

 

Networking QoS and Performance

Lecture 1

Internet QoS options

Improving Quality of Service guarantees and performance in data networks is a key requirement of Grid computing. Indeed, fast transfers require high-bit-rate connections, and Grid operation requires network predictability and high availability. On the other hand, the Internet's historical technology is not naturally suited to deterministic behavior. This lecture explains the technical challenges and the range of options available to improve QoS guarantees in Internet-based networks.

François Fluckiger

Lecture 2

TCP and Congestion Control

Not only must the underlying network perform well, but the network software running within the end-systems must also behave optimally. This lecture recalls the basics of TCP and discusses the relationship between TCP and the risk of congestion over Internet-based connections.

François Fluckiger

Lecture 3

Multimedia over the Internet

The Grid is not only a network of computer resources but also a network of people cooperating to use these resources. The collaborative tools that scientists increasingly use include audio and video systems. These place new, challenging requirements on the networking systems. The class discusses these requirements and their consequences for the end-systems as well as for the underlying network.

François Fluckiger