CSE240: Programming Language Concepts

Data Types and C/C++ Fundamentals

Welcome to this comprehensive course on programming language concepts, focusing on data types and fundamental C/C++ programming. This course will provide you with a solid foundation in understanding how programming languages handle data and how to effectively program in C and C++.

Course Overview

This module covers three main learning objectives:

  • LO1: Data typing terminology and approaches
  • LO2: Basic I/O development in C and C++
  • LO3: Working with basic data types and C-style strings

Learning Philosophy

Programming is best learned through hands-on practice. As you progress through this course:

  1. Read each section carefully
  2. Run the provided code examples
  3. Experiment with modifications
  4. Predict outcomes before testing
  5. Build your mental model of how code works

💡 Pro Tip

Don't just read code—tinker with it! Try removing features, adding new ones, or combining concepts. This empirical approach will dramatically improve your understanding.

Module 1: Data Types and Terminology

Understanding data types is fundamental to programming language design and usage. This module explores the terminology, approaches, and concepts that underpin how programming languages handle different kinds of data.

Module Objectives

  • Master common terminology for types
  • Understand type equivalence concepts
  • Distinguish between strong and weak typing
  • Appreciate orthogonality in language design

1.1: Common Terminology for Types

Before diving into programming with types, we need to establish a common vocabulary. Understanding these terms will help you communicate effectively about programming languages and their type systems.

What is a Type?

A type is a classification of data that tells the compiler or interpreter how the programmer intends to use the data. Types determine:

  • What values are possible
  • What operations can be performed
  • How much memory is required
  • How the data is represented in memory

Key Terminology

Primitive Types

Primitive types are the basic data types provided by a programming language. They are not composed of other types.

// Examples of primitive types in C
int age = 25;               // Integer type
float temperature = 98.6f;  // Floating-point type
char grade = 'A';           // Character type

Composite Types

Composite types are constructed from one or more primitive or other composite types.

// Examples of composite types in C
int numbers[10];   // Array type
struct Point {     // Structure type
    float x, y;
};

Type System

A type system is a logical system comprising a set of rules that assigns a property called a type to every expression in a programming language.

Static vs Dynamic Typing

  • Static Typing: Types are determined at compile time (C, C++, Java)
  • Dynamic Typing: Types are determined at runtime (Python, JavaScript)

Type Declaration vs Type Inference

Type declaration requires the programmer to explicitly specify the type, while type inference allows the compiler to deduce the type automatically.

// Explicit type declaration in C
int count = 10;

// Type inference in C++ (auto keyword)
auto total = 10;   // Compiler infers int from the initializer

1.2: Structural and Name Equivalence

When are two types considered equivalent? This question is crucial for type checking and determines when assignments and parameter passing are allowed.

Name Equivalence

Name equivalence means two types are equivalent only if they have the same name. This is the strictest form of type equivalence.

// Example of name equivalence issues
typedef int Celsius;
typedef int Fahrenheit;

Celsius temp1 = 25;
Fahrenheit temp2 = 77;

// In strict name equivalence, this would be an error:
// temp1 = temp2;   // Different type names!
// (C itself accepts it, because typedef only creates an alias for int.)

Structural Equivalence

Structural equivalence means two types are equivalent if they have the same structure, regardless of their names.

// Structural equivalence example
struct Point2D_A {
    float x, y;
};
struct Point2D_B {
    float x, y;
};
// These would be structurally equivalent
// (same structure, different names)

⚠️ C/C++ Type Equivalence

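C and C++ use name equivalence for user-defined types (structs, unions, enums): two struct types with identical members are still distinct types. Built-in types and pointers are treated structurally, and a typedef only introduces an alias, so it never creates a new type.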

Implications of Type Equivalence

Assignment Compatibility

Type equivalence determines when assignments are valid:

// C example showing type compatibility
int a = 10;
float b = 3.14f;
a = b;   // Usually allowed (with a possible warning)
         // Involves an implicit conversion: a becomes 3

Function Parameter Matching

Function calls must match parameter types according to the language's equivalence rules:

void process_temperature(Celsius temp) {
    printf("Temperature: %d°C\n", temp);
}

Fahrenheit f_temp = 77;
// process_temperature(f_temp);   // May be an error under strict name equivalence

Real-World Application

Understanding type equivalence helps you design APIs that are both type-safe and flexible. Many bugs occur when programmers assume structural equivalence but the language enforces name equivalence.

1.3: Strong versus Weak Type Checking

The strength of a type system determines how strictly the language enforces type rules and how it handles type mismatches.

Strong Typing

Strong typing means the language prevents type errors by restricting or forbidding implicit conversions between different types.

Characteristics of Strong Typing:

  • Type errors are caught at compile time or runtime
  • Implicit conversions are limited
  • Type safety is prioritized
  • Programs are more predictable
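
// Example: Java (strongly typed)
String name = "Alice";
int age = 25;
// int copy = name;                          // Compile error: incompatible types
int parsed = Integer.parseInt("25");         // Explicit conversion required
String result = name + String.valueOf(age);  // Explicit conversion
// Note: name + age also compiles, because Java defines + as string concatenation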

Weak Typing

Weak typing allows more implicit conversions between types, potentially leading to unexpected behavior.

Characteristics of Weak Typing:

  • More implicit conversions allowed
  • Flexibility over safety
  • Potential for subtle bugs
  • May require runtime type checking

// Example: C (relatively weakly typed)
char ch = 'A';
int ascii_val = ch;          // Implicit conversion allowed
printf("%d\n", ascii_val);   // Prints 65 on ASCII systems

// Pointer conversions (dangerous!)
void* ptr = malloc(sizeof(int));
int* int_ptr = ptr;          // Implicit conversion from void*

C and C++ on the Spectrum

C and C++ fall somewhere in the middle of the strong/weak typing spectrum:

C: Moderately Weak

  • Allows many implicit conversions
  • Pointer arithmetic and void* conversions
  • Limited compile-time type checking

C++: Stronger than C

  • More restrictive than C
  • Better type checking
  • Templates provide type safety
  • Still allows some dangerous operations

// C++ being stricter than C
#include <cstdlib>

int main() {
    void* ptr = std::malloc(sizeof(int));
    // int* iptr = ptr;                  // Error in C++, OK in C
    int* iptr = static_cast<int*>(ptr);  // Explicit cast required
    std::free(iptr);
    return 0;
}

1.4: Orthogonality in Language Design

Orthogonality is a principle from mathematics applied to programming language design. It refers to the ability to combine language features in all possible ways without restrictions or unexpected interactions.

What is Orthogonality?

Orthogonality in programming languages means that language features can be combined freely without arbitrary restrictions. A highly orthogonal language has:

  • Few restrictions on how features combine
  • Consistent behavior across contexts
  • Minimal special cases
  • Predictable feature interactions

Benefits of Orthogonality

Advantages:

  • Simplicity: Fewer rules to remember
  • Regularity: Consistent patterns
  • Expressiveness: More ways to express ideas
  • Learnability: Easier to master the language

Examples of Orthogonality

Orthogonal: Pascal Arrays

{ Pascal - highly orthogonal }
var
  int_array: array[1..10] of integer;
  real_array: array[1..10] of real;
  char_array: array[1..10] of char;
{ Arrays can be of ANY type - fully orthogonal }

Less Orthogonal: C Arrays

// C - less orthogonal due to restrictions
int int_array[10];       // OK
float float_array[10];   // OK
// int variable_array[n];   // Not allowed in older C (VLAs were added in C99)

// Functions cannot return arrays directly
// int[10] get_array();     // Error - not orthogonal

Orthogonality vs Simplicity Trade-off

While orthogonality generally improves language design, perfect orthogonality can sometimes lead to complexity:

⚠️ Potential Issues

  • Too much flexibility can be overwhelming
  • Some combinations might be inefficient
  • Error checking becomes more complex
  • Implementation complexity increases

C/C++ Orthogonality Analysis

Areas of Good Orthogonality:

  • Pointers can point to any type
  • Most operators work with compatible types
  • Functions can take most types as parameters

Areas of Poor Orthogonality:

  • Arrays and functions have special rules
  • void type has restrictions
  • Some operators don't work with all types

// C orthogonality examples
int x = 5;
int *ptr = &x;             // Can take the address of a variable
// int *ptr2 = &(x + 1);   // Cannot take the address of a temporary value

// Good orthogonality: pointers to any type
char *char_ptr;
float *float_ptr;
struct Point *point_ptr;

// Poor orthogonality: array limitations
int arr[10];
// sizeof(arr) gives the whole array's size here, but inside a function an
// array parameter decays to a pointer, so sizeof(parameter_array) gives
// only the size of a pointer

Design Lesson

When designing systems or APIs, strive for orthogonality where it makes sense. This makes your code more predictable and easier to use, but don't sacrifice performance or safety for perfect orthogonality.

Module 2: C and C++ Basic I/O

This module introduces you to programming in C and C++, focusing on input and output operations. You'll learn the fundamental structure of programs and how to interact with users through the console.

Module Objectives

  • Understand C/C++ as imperative languages
  • Master basic program structure
  • Implement console I/O operations
  • Use formatted and unformatted I/O functions
  • Understand data storage and scoping

2.1: C and C++ as Imperative Languages

C and C++ are primarily imperative programming languages. Understanding this paradigm is crucial for effective programming in these languages.

What is Imperative Programming?

Imperative programming is a programming paradigm that describes computation in terms of statements that change program state. It focuses on:

  • How to solve a problem (not just what the solution is)
  • Step-by-step instructions
  • Mutable state and variables
  • Control flow structures (loops, conditionals)

Key Characteristics of Imperative Languages

1. Sequential Execution

Statements are executed in order, one after another:

#include <stdio.h>

int main() {
    int x = 10;                 // Step 1: Assign 10 to x
    int y = 20;                 // Step 2: Assign 20 to y
    int sum = x + y;            // Step 3: Calculate sum
    printf("Sum: %d\n", sum);   // Step 4: Print result
    return 0;                   // Step 5: Exit program
}

2. Mutable State

Variables can be modified after creation:

int counter = 0;         // Initial state
counter = counter + 1;   // Modify state
counter += 1;            // Another modification
counter++;               // Yet another way to modify

3. Control Flow

Programs use control structures to determine execution path:

// Conditional execution
if (temperature > 100) {
    printf("Water boils\n");
} else {
    printf("Water is liquid\n");
}

// Repetitive execution
for (int i = 0; i < 10; i++) {
    printf("Count: %d\n", i);
}

Imperative vs Other Paradigms

Paradigm Comparison

  • Imperative: Focus on how (C, C++, Java)
  • Declarative: Focus on what (SQL, HTML)
  • Functional: Focus on functions (Haskell, Lisp)
  • Object-Oriented: Focus on objects (C++, Java)

Why C/C++ Are Imperative

Historical Design

C was designed for system programming, requiring:

  • Direct control over memory
  • Efficient execution
  • Clear mapping to machine instructions

Performance Considerations

The imperative style maps well to computer architecture:

// This C code maps directly to machine instructions
int a = 5;            // LOAD 5 into register/memory
int b = 10;           // LOAD 10 into register/memory
int result = a + b;   // ADD registers, STORE result

Practical Impact

Understanding the imperative nature of C/C++ helps you:

  • Write efficient code
  • Debug effectively by tracking state changes
  • Understand memory management
  • Optimize performance

2.2: Minimal Elements of C and C++ Programs

Every C and C++ program has certain essential components. Understanding these minimal elements is crucial for writing any program, no matter how simple or complex.

The Simplest C Program

#include <stdio.h>

int main() {
    return 0;
}

Let's break down each component:

1. Preprocessor Directives

#include <stdio.h> is a preprocessor directive that tells the preprocessor to insert the contents of the stdio.h header file at that point.

Common Header Files:

  • <stdio.h> - Standard input/output
  • <stdlib.h> - Standard library functions
  • <string.h> - String manipulation
  • <math.h> - Mathematical functions

2. The main() Function

Every C program must have exactly one main() function. This is where program execution begins.

int main() {      // Function definition begins
    // Program statements go here
    return 0;     // Return statement
}                 // Function end

3. Return Statement

return 0; indicates successful program termination. Non-zero values typically indicate errors.

Anatomy of a Complete Program

/*
 * Program: Hello World
 * Author: Student
 * Purpose: Demonstrate basic C program structure
 */

#include <stdio.h>    // Preprocessor directive
#include <stdlib.h>   // Another header (defines EXIT_SUCCESS)

// Global variable (optional)
int global_var = 42;

// Function prototype (declaration)
void greet_user(void);

// Main function - program entry point
int main() {
    printf("Hello, World!\n");   // Function call
    greet_user();                // Call our function
    return EXIT_SUCCESS;         // Return success code
}

// Function definition
void greet_user(void) {
    printf("Welcome to C programming!\n");
}

C++ Minimal Program

C++ programs are similar but use different header files and syntax:

#include <iostream>    // C++ style header
using namespace std;   // Use standard namespace

int main() {
    cout << "Hello, World!" << endl;
    return 0;
}

Program Structure Components

Comments

Document your code for clarity:

// Single-line comment in C99 and C++

/* Multi-line comment
   Works in all C versions */

Preprocessing Phase

Before compilation, the preprocessor:

  • Includes header files
  • Expands macros
  • Handles conditional compilation
  • Removes comments

Compilation Units

Each .c file, together with the headers it includes, forms a separate translation unit that is compiled independently.

⚠️ Common Beginner Mistakes

  • Forgetting to include necessary headers
  • Missing semicolons after statements
  • Incorrect main() function signature
  • Forgetting the return statement in main() (C99 and C++ return 0 implicitly, but an explicit return is clearer)

Program Execution Flow

  1. Preprocessing: Headers included, macros expanded
  2. Compilation: C code translated to machine code (object files)
  3. Linking: Object files and external libraries combined into an executable
  4. Loading: Program loaded into memory
  5. Execution: main() function called

2.3: Basic Input and Output Programming

Input and output operations are fundamental to most programs. This section covers how to read data from the keyboard and display information to the console in C.

Output with printf()

The printf() function is the most common way to display output in C:

#include <stdio.h>

int main() {
    printf("Hello, World!\n");
    printf("This is line 2\n");
    return 0;
}

Basic printf() Usage

  • \n creates a new line
  • Strings must be enclosed in double quotes
  • printf() returns the number of characters printed

Input with scanf()

The scanf() function reads formatted input from the keyboard:

#include <stdio.h>

int main() {
    int age;

    printf("Enter your age: ");
    scanf("%d", &age);
    printf("You are %d years old.\n", age);
    return 0;
}

⚠️ The Address-of Operator (&)

Notice the &age in scanf(). The & operator gets the memory address of the variable so scanf() can store the input there.

Complete Input/Output Example

#include <stdio.h>

int main() {
    char name[50];
    int age;
    float height;

    // Get input from user
    printf("What's your name? ");
    scanf("%49s", name);   // No & needed for strings; the width prevents overflow

    printf("How old are you? ");
    scanf("%d", &age);

    printf("How tall are you (in meters)? ");
    scanf("%f", &height);

    // Display the information
    printf("\n--- Your Information ---\n");
    printf("Name: %s\n", name);
    printf("Age: %d years\n", age);
    printf("Height: %.2f meters\n", height);
    return 0;
}

Common I/O Patterns

Reading Multiple Values

int x, y;
printf("Enter two numbers: ");
scanf("%d %d", &x, &y);
printf("Sum: %d\n", x + y);

Input Validation Loop

int num;
int c;
printf("Enter a positive number: ");
while (scanf("%d", &num) != 1 || num <= 0) {
    printf("Invalid input. Enter a positive number: ");
    while ((c = getchar()) != '\n' && c != EOF);   // Clear input buffer
    if (c == EOF) break;                           // Stop at end of input
}

C++ Style I/O

C++ provides stream-based I/O that's often easier to use:

#include <iostream>
#include <string>
using namespace std;

int main() {
    string name;
    int age;

    cout << "Enter your name: ";
    getline(cin, name);   // Read entire line including spaces

    cout << "Enter your age: ";
    cin >> age;

    cout << "Hello, " << name << "! You are " << age << " years old." << endl;
    return 0;
}

C vs C++ I/O Comparison

  • C: printf/scanf - format specifiers, manual address handling
  • C++: cout/cin - type-safe, easier syntax
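  • Performance: C stdio is often faster by default (iostreams stay synchronized with stdio unless std::ios::sync_with_stdio(false) is called)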
  • Safety: C++ is more type-safe

Common Input/Output Challenges

Buffer Issues

Input buffering can cause unexpected behavior:

int num;
char letter;

printf("Enter a number: ");
scanf("%d", &num);

printf("Enter a letter: ");
scanf("%c", &letter);    // May skip due to buffered newline

// Better approach:
scanf(" %c", &letter);   // Space before %c consumes whitespace

String Input with Spaces

char name[100];

// This only reads one word
scanf("%99s", name);

// This reads an entire line (and keeps the trailing '\n' if it fits)
fgets(name, sizeof(name), stdin);

2.4: Formatted Library Functions

Formatted I/O functions in C provide powerful control over how data is read and displayed. Understanding format specifiers and their options is essential for professional C programming.

printf() Format Specifiers

Format specifiers tell printf() how to interpret and display data:

#include <stdio.h>

int main() {
    int integer = 42;
    float decimal = 3.14159f;
    char character = 'A';
    char string[] = "Hello";

    printf("Integer: %d\n", integer);
    printf("Float: %f\n", decimal);
    printf("Character: %c\n", character);
    printf("String: %s\n", string);
    return 0;
}

Common Format Specifiers

Basic Format Specifiers

  • %d or %i - signed decimal integer
  • %u - unsigned decimal integer
  • %f - floating-point number
  • %c - single character
  • %s - string
  • %p - pointer address
  • %x - hexadecimal (lowercase)
  • %X - hexadecimal (uppercase)
  • %o - octal

Format Modifiers

Modifiers control the appearance and precision of output:

Width and Precision

float pi = 3.14159f;
printf("Default: %f\n", pi);               // 3.141590
printf("2 decimals: %.2f\n", pi);          // 3.14
printf("Width 10: %10.2f\n", pi);          // "      3.14"
printf("Left-aligned: %-10.2f|\n", pi);    // "3.14      |"
printf("Zero-padded: %010.2f\n", pi);      // "0000003.14"

Integer Formatting

int number = 42;
printf("Decimal: %d\n", number);              // 42
printf("Hexadecimal: %x\n", number);          // 2a
printf("Octal: %o\n", number);                // 52
printf("With signs: %+d\n", number);          // +42
printf("Space for positive: % d\n", number);  // " 42"
printf("Zero-padded: %05d\n", number);        // 00042

scanf() Format Specifiers

scanf() uses similar format specifiers for reading input:

#include <stdio.h>

int main() {
    int age;
    float height;
    char grade;
    char name[50];

    printf("Enter age, height, grade, and name: ");
    scanf("%d %f %c %49s", &age, &height, &grade, name);

    printf("Age: %d\n", age);
    printf("Height: %.2f\n", height);
    printf("Grade: %c\n", grade);
    printf("Name: %s\n", name);
    return 0;
}

Advanced scanf() Features

Field Width Limiting

char name[20];
printf("Enter name (max 19 chars): ");
scanf("%19s", name);   // Prevents buffer overflow

Skipping Characters

int day, month, year;
printf("Enter date (DD/MM/YYYY): ");
scanf("%d/%d/%d", &day, &month, &year);

Reading Different Bases

int decimal;
unsigned int hex, octal;   // %x and %o store into unsigned int
printf("Enter decimal, hex (0x...), octal (0...): ");
scanf("%d %x %o", &decimal, &hex, &octal);

sprintf() and sscanf()

These functions work with strings instead of console I/O:

#include <stdio.h>

int main() {
    char buffer[100];
    int age = 25;
    char name[] = "Alice";

    // Write formatted data to string
    sprintf(buffer, "Name: %s, Age: %d", name, age);
    printf("Buffer contains: %s\n", buffer);

    // Read formatted data from string
    // %49[^,] reads up to the comma; a plain %s would swallow "Alice,"
    // and the rest of the format would then fail to match
    char parsed_name[50];
    int parsed_age;
    if (sscanf(buffer, "Name: %49[^,], Age: %d", parsed_name, &parsed_age) == 2) {
        printf("Parsed: %s is %d years old\n", parsed_name, parsed_age);
    }
    return 0;
}

⚠️ Security Considerations

  • Always limit string input length with scanf()
  • Check return values to detect input errors
  • Be careful with buffer sizes in sprintf()
  • Consider using safer functions like snprintf()

Error Handling

int number;
int c;
printf("Enter a number: ");
if (scanf("%d", &number) == 1) {
    printf("You entered: %d\n", number);
} else {
    printf("Invalid input!\n");
    // Clear input buffer (stop at end of input as well)
    while ((c = getchar()) != '\n' && c != EOF);
}

3.4: Structure of C-Style Strings

C-style strings are fundamental to C programming. Unlike higher-level languages, C doesn't have a built-in string type. Instead, strings are implemented as arrays of characters with a special terminating character.

What is a C-Style String?

A C-style string is an array of characters terminated by a null character ('\0').

char greeting[] = "Hello";
// Internally stored as: ['H', 'e', 'l', 'l', 'o', '\0']
//                        [0]  [1]  [2]  [3]  [4]  [5]

String Declaration Methods

Method 1: String Literal

char message[] = "Hello, World!";
// Compiler automatically adds '\0' and calculates size

Method 2: Character Array

char message[14] = {'H','e','l','l','o',',',' ','W','o','r','l','d','!','\0'};
// Manual specification - tedious but explicit

Method 3: Fixed-Size Array

char message[50] = "Hello";
// Array has 50 characters, string uses first 6 (including '\0')

The Null Terminator

The null character '\0' is crucial:

  • Marks the end of the string
  • Has ASCII value 0
  • Automatically added by string literals
  • Must be manually added when building strings character by character

⚠️ Missing Null Terminator

Forgetting the null terminator leads to undefined behavior. String functions won't know where the string ends!

char bad_string[5] = {'H', 'e', 'l', 'l', 'o'};          // No '\0'!
// printf("%s", bad_string);   // Undefined behavior - may print garbage

char good_string[6] = {'H', 'e', 'l', 'l', 'o', '\0'};   // Correct
printf("%s", good_string);     // Safe to print

String vs Character Array

Key Differences:

  • String: Null-terminated character array
  • Character Array: Just an array of chars (may not be null-terminated)
  • String Functions: Work only with null-terminated strings

Common String Operations

String Length

#include <string.h>

char name[] = "Alice";
size_t length = strlen(name);   // Returns 5 (doesn't count '\0')

String Copy

char source[] = "Hello";
char destination[20];
strcpy(destination, source);   // Copies "Hello\0" to destination
                               // (destination must be large enough)

String Comparison

char str1[] = "apple";
char str2[] = "banana";
int result = strcmp(str1, str2);
// Returns: negative if str1 < str2
//          zero     if str1 == str2
//          positive if str1 > str2
