File Processing
COURSE PLAN
• Course : File Processing
• Lecturer : Dr. Asmaa Elsaid
• Assessment : Total marks 100%
Coursework (Lab +Midterm+ Quizzes+ project) 40%
Final Exam 60%
COURSE PLAN (cont)
Lecture Notes : Available
• References for course:
—File Structures, An Object-Oriented Approach with C++, by
Michael J. Folk, Bill Zoellick and Greg Riccardi
—Visual C# How to Program (Deitel Series) 6th Edition
Chapter 1
Introduction To File
Organization
Data processing
• Storage of data
Data is saved in files, and the files are saved on
storage devices, which are either primary storage
devices (RAM) or secondary storage devices
(hard disk).
• Organization of data
Deals with how we organize files in primary and secondary
storage.
• Access to data
We can use sequential or direct access.
• Processing of data
The processing of data after storage, organization,
and access.
Data Structures versus File Structures
• Data Structures versus File Structures
• Both involve
Representation of Data + Operations for
accessing data
• Difference
- Data Structures deal with data in main
memory
- File Structures deal with data in secondary
storage (Files)
File and Computer Storage
• What is a File?
❖A file is a collection of data stored on secondary
(non-volatile) storage.
• Secondary storage examples:
hard drives and other disks, tape, optical media,
and any other medium that does not lose its
information when the power is turned off.
• Computer storage can be either secondary storage
or main memory.
Computer Storage
(Figure: examples of stored files: A.mp3, first.java, mo.flv, …………)
How fast is main memory in comparison to
secondary storage?
➢ Typical time for getting information from
main memory: 120 nanoseconds
magnetic disks: 30 milliseconds
• By analogy, keeping the same time proportion as above
(30 ms / 120 ns = 250,000):
Looking at the index of the book: 20 seconds
versus
going to the library: about 58 days
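The proportion in this analogy can be checked with a few lines of arithmetic. The sketch below is only an illustration; the class name AccessTimeAnalogy and the method Ratio are made up for this example and are not part of the course material.

```csharp
using System;

static class AccessTimeAnalogy
{
    // Ratio of disk access time to memory access time (both in seconds).
    public static double Ratio(double memorySeconds, double diskSeconds)
    {
        return diskSeconds / memorySeconds;
    }

    static void Main()
    {
        double memory = 120e-9;   // 120 nanoseconds
        double disk = 30e-3;      // 30 milliseconds
        double ratio = Ratio(memory, disk);            // roughly 250,000

        double librarySeconds = 20 * ratio;            // the 20-second index lookup, scaled up
        double libraryDays = librarySeconds / 86400;   // 86,400 seconds in a day

        Console.WriteLine("Disk access is about {0:F0} times slower", ratio);
        Console.WriteLine("Library trip: about {0:F1} days", libraryDays);
    }
}
```

With these figures the scaled "library trip" comes to roughly 5,000,000 seconds, which is a little under 58 days.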
❑ Main Memory
✓Fast: since electronic
✓Small: since expensive
✓Volatile: information is lost when a power failure occurs
❑ Secondary Storage
✓Slow: since electronic and mechanical
✓Large : since cheap
✓Stable, persistent, information is preserved longer
How fast is main memory in comparison to
secondary storage?
❑ Semiconductor RAM is on the order of 100,000 times faster
(250,000 times with the typical figures above), but it
is expensive and volatile
❑ File storage is large and cheap, but access to files is very slow!
❑ Why is file storage so slow?
✓ File storage systems have many moving parts. In
contrast, semiconductor memory has no moving parts.
Why do we need files?
• Files are the only suitable way to store sizable
amounts of certain types of information:
pictures, music, and video are some examples.
• Most operating systems are too complex and
large to be fully loaded into memory.
• Databases are everywhere.
• Data backup and archiving.
File Structures:
A file has two Structures:
1. Logical structure (logical file):
The logical structure of a file is how programmers
see it (text, image, data file, …….).
2. Physical structure (physical file):
The physical structure of a file is how it is stored on
secondary storage.
File Organization
(Figure: Unorganized versus Organized)
File Organization
• Logical file Organization methods:
❖Heap, indexing, hash….
• Physical file Organization methods:
❖Organizing Track by sectors
❖Organizing Track by Blocks
• File organization methods must provide:
❖Fast access time
❖Good space utilization
File Management
File Management Involves techniques for:
❖Preparing/formatting storage media to store data.
❖Allocating storage space and addressing.
❖Managing free spaces.
File Managers:
Files are managed by System Software:
❖OS (Operating System)
❖DBMS (Data Base Management System)
Course objectives:
• Minimize the number of trips to the disk needed to get
the desired information. Ideally, get what we
need in one disk access, or with as few
disk accesses as possible.
• Group related information so that we are
likely to get everything we need with only one
trip to the disk (e.g. name, address, phone
number, account balance).
Chapter 2
FUNDAMENTAL FILE
PROCESSING OPERATIONS
Sample programs for file manipulation
• Example 1: A program to display the contents of a
text file on the screen
• Solution
1. Prompt the user for the file name
2. Read the file name
3. Open the file for input (reading)
4. While there are lines to read from the input file
   a. Read a line from the file
   b. Write the line to the screen
5. Close the input file
Code
A C# program for doing this task
// Example1.cs
// List the contents of a file using C# stream classes
using System;
using System.IO;

class Example1
{
    static void Main()
    {
        string line;
        FileStream myfile;   // declare an unattached FileStream
        string filename;

        Console.WriteLine("Enter the name of the file: ");   // Step 1
        filename = Console.ReadLine();                        // Step 2
        myfile = new FileStream(filename, FileMode.Open, FileAccess.Read);   // Step 3
        StreamReader mystream = new StreamReader(myfile);     // Step 3
        while (true)
        {
            line = mystream.ReadLine();   // Step 4a
            if (line == null) break;      // null means the end of the file was reached
            Console.WriteLine(line);      // Step 4b
        }
        mystream.Close();     // Step 5
        myfile.Close();       // Step 5
        Console.ReadLine();   // keep the console window open
    } // end method Main
} // end class Example1
Physical Files and Logical Files
❖Physical file: a collection of bytes that physically
exists on secondary storage; it is known by the
operating system and appears in its file directory.
❖Logical file: a “channel” (like a telephone line)
used inside the program that links the
program to the physical file.
Physical Files and Logical Files
• The program (application) sends (or receives) bytes to
(from) a file through the logical file. The program
knows nothing about where the bytes go (or come from).
• The operating system is responsible for associating a
logical file in a program to a physical file on disk or
tape. Writing to or reading from a file in a program is
done through the operating system.
• The physical file has a name, for instance myfile.txt
• The logical file has a logical name used for referring to
the file inside the program. This logical name is a
variable inside the program, for instance outfile.
Physical Files and Logical Files
• In C# the logical name is the name of an object
of the class FileStream: FileStream outfile;
• The logical name outfile will be associated with
the physical file myfile.txt at the time of opening
the file as we will see next.
Opening Files
• Opening a file makes it ready for use by the program. Two options for
opening a file:
FileStream <object_name> = new FileStream(<file_name>,
<FileMode Enumerator>);
FileStream <object_name> = new FileStream(<file_name>,
<FileMode Enumerator>, <FileAccess Enumerator>);
...
Example:
FileStream F = new FileStream("sample.txt", FileMode.Open);
or
FileStream F = new FileStream("sample.txt", FileMode.Open,
FileAccess.Read);
Here "sample.txt" is the physical file and the object F is the logical file.
StreamReader mystream = new StreamReader(F);
Creating a new file
• myfile = new FileStream(filename, FileMode.Create, FileAccess.Write,
FileShare.None);
• StreamWriter mystream = new StreamWriter(myfile);
• When we open a file, we are positioned at the beginning of the file
(except with FileMode.Append, which positions us at the end).
• The first argument gives the path to the physical file.
• The second argument indicates the mode.
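The two statements above can be put together into a small program. This is a minimal sketch, not the lecture's own solution: the class name CreateAndWrite, the method WriteLines, and the file name sample.txt are chosen only for illustration.

```csharp
using System;
using System.IO;

static class CreateAndWrite
{
    // Creates (or overwrites) the file at 'path' and writes the given lines to it.
    public static void WriteLines(string path, string[] lines)
    {
        FileStream myfile = new FileStream(path, FileMode.Create,
                                           FileAccess.Write, FileShare.None);
        StreamWriter mystream = new StreamWriter(myfile);
        foreach (string line in lines)
        {
            mystream.WriteLine(line);
        }
        mystream.Close();   // flushes the buffer and also closes myfile
    }

    static void Main()
    {
        string path = "sample.txt";   // hypothetical file name
        WriteLines(path, new string[] { "first line", "second line" });

        // Read the file back to confirm what was written.
        Console.Write(File.ReadAllText(path));
    }
}
```

Closing the StreamWriter is enough here because it also closes the FileStream it wraps.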
File Modes
• The file modes are:
✓ FileMode.Append: If the file already exists, the new data will be added to
its end. If the file doesn't exist, it will be created and the new data will be
added to it. This mode requires FileAccess.Write.
✓ FileMode.Create: If the file already exists, it will be overwritten. If the
file doesn't exist, it will be created.
✓ FileMode.CreateNew: If the file already exists, an exception (IOException)
will be thrown at run time. If the file doesn't exist, it will be created.
✓ FileMode.Open: If the file exists, it will be opened. If the file doesn't
exist, an exception (FileNotFoundException) will be thrown.
✓ FileMode.OpenOrCreate: If the file already exists, it will be opened. If
the file doesn't exist, it will be created.
✓ FileMode.Truncate: If the file already exists, its contents will be deleted
completely but the file will be kept, allowing you to write new data to it. If
the file doesn't exist, an exception (FileNotFoundException) will be thrown.
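The differences between the modes are easiest to see by watching the file size change. The following is a rough sketch under stated assumptions: the class FileModeDemo, the helper WriteOneByte, and the temporary file name are all invented for this illustration.

```csharp
using System;
using System.IO;

static class FileModeDemo
{
    // Opens 'path' with the given mode, writes a single byte,
    // and returns the resulting file length in bytes.
    public static long WriteOneByte(string path, FileMode mode)
    {
        using (FileStream fs = new FileStream(path, mode, FileAccess.Write))
        {
            fs.WriteByte((byte)'x');
        }   // the stream is closed here
        return new FileInfo(path).Length;
    }

    static void Main()
    {
        // Hypothetical scratch file in the system's temporary folder.
        string path = Path.Combine(Path.GetTempPath(), "filemode_demo.txt");
        File.Delete(path);   // does nothing if the file is not there

        Console.WriteLine(WriteOneByte(path, FileMode.CreateNew));   // file is created
        Console.WriteLine(WriteOneByte(path, FileMode.Append));      // file grows
        Console.WriteLine(WriteOneByte(path, FileMode.Create));      // file is overwritten

        try
        {
            WriteOneByte(path, FileMode.CreateNew);   // file already exists
        }
        catch (IOException e)
        {
            Console.WriteLine("CreateNew failed: " + e.GetType().Name);
        }

        File.Delete(path);
        try
        {
            WriteOneByte(path, FileMode.Open);        // file no longer exists
        }
        catch (FileNotFoundException e)
        {
            Console.WriteLine("Open failed: " + e.GetType().Name);
        }
    }
}
```

Note that the errors are raised while the program runs, so they can be handled with try/catch as shown.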
File Access
• The third argument indicates the access type.
• The file access values are:
✓ FileAccess.Write: New data can be written to the file.
✓ FileAccess.Read: Existing data can be read from the file.
✓ FileAccess.ReadWrite: Existing data can be read from the
file and new data can be written to the file.
• The fourth argument indicates file sharing, that is, how others may
open the file while it is still open here. Its values are:
✓ FileShare.None: The file cannot be shared.
✓ FileShare.Read: The file can also be opened by others for reading.
✓ FileShare.Write: The file can also be opened by others for writing.
✓ FileShare.ReadWrite: The file can also be opened by others to write
to it or read from it.
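One way to see the effect of the access argument is to inspect the CanRead and CanWrite properties of the opened stream. The sketch below assumes a scratch file; the class FileAccessDemo and the method Describe are hypothetical names used only here.

```csharp
using System;
using System.IO;

static class FileAccessDemo
{
    // Opens an existing file with the requested access and reports
    // what the resulting stream allows.
    public static string Describe(string path, FileAccess access)
    {
        FileStream fs = new FileStream(path, FileMode.Open, access, FileShare.Read);
        string result = "CanRead=" + fs.CanRead + ", CanWrite=" + fs.CanWrite;
        fs.Close();
        return result;
    }

    static void Main()
    {
        // Hypothetical scratch file, created first so that FileMode.Open succeeds.
        string path = Path.Combine(Path.GetTempPath(), "fileaccess_demo.txt");
        File.WriteAllText(path, "some data");

        Console.WriteLine(Describe(path, FileAccess.Read));
        Console.WriteLine(Describe(path, FileAccess.Write));
        Console.WriteLine(Describe(path, FileAccess.ReadWrite));

        File.Delete(path);
    }
}
```

Attempting an operation the stream does not allow, such as writing to a stream opened with FileAccess.Read, fails at run time.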
Closing Files
• This is like “hanging up” the line connected to a file.
• After closing a file, the logical name is free to be associated with
another physical file.
• Closing a file used for output guarantees that everything has been
written to the physical file.
• We will see later that bytes are not sent directly to the physical file
one by one; they are first stored in a buffer to be written later as a
block of data.
• When the file is closed the leftover from the buffer is flushed to the
file.
• Files are usually closed automatically by the operating system at
the end of the program's execution.
• It is better to close the file explicitly to prevent data loss in case the
program does not terminate normally.
• Example: myfile.Close();
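C# also offers the using statement, which closes a stream automatically when the block ends, even if an exception is thrown inside it. This is not covered in the slides above; the sketch below is one possible way to apply it, with the class SafeClose and the method WriteGreeting as made-up names.

```csharp
using System;
using System.IO;

static class SafeClose
{
    // Writes one line to 'path'; the writer is closed when the using block ends.
    public static void WriteGreeting(string path)
    {
        using (StreamWriter writer = new StreamWriter(path))
        {
            writer.WriteLine("hello");
        }   // the buffer is flushed and the file is closed here
    }

    static void Main()
    {
        // Hypothetical scratch file in the system's temporary folder.
        string path = Path.Combine(Path.GetTempPath(), "safeclose_demo.txt");
        WriteGreeting(path);
        Console.Write(File.ReadAllText(path));
        File.Delete(path);
    }
}
```

The effect is the same as calling Close() in every exit path, without having to remember to do so.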
StreamReader and StreamWriter
Classes
• The StreamReader and StreamWriter classes
are used for reading data from and writing data to
text files.
StreamReader sr = new StreamReader(F);
StreamWriter sw = new StreamWriter(F);
QUIZ
• Using C#, write a program that copies
the contents of file 1 into a new file named file 2
using System;
using System.IO;

class Example2
{
    static void Main()
    {
        string line;
        string filename1, filename2;
        Console.WriteLine("Enter the name of file1: ");
        filename1 = Console.ReadLine();
        Console.WriteLine("Enter the name of file2: ");
        filename2 = Console.ReadLine();
        StreamReader file1 = new StreamReader(filename1);   // open file1 for reading
        StreamWriter file2 = new StreamWriter(filename2);   // create or overwrite file2
        while ((line = file1.ReadLine()) != null)           // null means end of file
        {
            file2.WriteLine(line);
        }
        file1.Close();
        file2.Flush();
        file2.Close();
        Console.ReadLine();
    } // end method Main
} // end class Example2
Detecting End_of_File
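• When we try to read and the file has ended, the
read is unsuccessful. We can test whether
this happened in different ways.
• For example:
while (mystream.ReadLine() != null)   // ReadLine() returns null at the end of the file
mystream.ReadToEnd()   // reads everything from the current position to the end of the file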
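The end-of-file tests can be sketched as follows. Besides the null test on ReadLine(), the example uses Peek(), which returns -1 when there is nothing left to read; Peek() is not mentioned in the slides and is shown here only as an alternative. The class EndOfFileDemo and its method names are invented for illustration.

```csharp
using System;
using System.IO;

static class EndOfFileDemo
{
    // Counts lines by testing the result of ReadLine() against null.
    public static int CountLines(string path)
    {
        int count = 0;
        using (StreamReader reader = new StreamReader(path))
        {
            while (reader.ReadLine() != null)
            {
                count++;
            }
        }
        return count;
    }

    // Counts characters by testing Peek() against -1 before each Read().
    public static int CountChars(string path)
    {
        int count = 0;
        using (StreamReader reader = new StreamReader(path))
        {
            while (reader.Peek() != -1)
            {
                reader.Read();
                count++;
            }
        }
        return count;
    }

    static void Main()
    {
        // Hypothetical scratch file in the system's temporary folder.
        string path = Path.Combine(Path.GetTempPath(), "eof_demo.txt");
        File.WriteAllText(path, "a\nb\nc\n");
        Console.WriteLine("Lines: " + CountLines(path));
        Console.WriteLine("Characters: " + CountChars(path));
        File.Delete(path);
    }
}
```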
Accessing the Contents Of A File In Random Order
• C# allows you to access the contents of a file in random
order. To do this, you will use the Seek() method defined by
FileStream.
• The Seek() method allows you to set the file position
indicator (also called the file pointer) to any point within a file.
• The method Seek() is shown here:
long Seek(long newPos, SeekOrigin origin)
Here, newPos specifies the offset, in bytes, of the new
file pointer position relative to the location specified by origin.
• The origin will be one of these values, which are defined by
the SeekOrigin enumeration:
Accessing the Contents Of A File In Random
Order
Value                 Meaning
SeekOrigin.Begin      Seek from the beginning of the file.
SeekOrigin.Current    Seek from the current location.
SeekOrigin.End        Seek from the end of the file.
After a call to Seek(), the next read or write operation will
occur at the new file position.
Example
• myfile.Seek(3030, SeekOrigin.Begin)
Moves to byte offset 3030 in myfile.
• myfile.Seek(0, SeekOrigin.End)
Moves to the end of myfile (just past the last byte).
• myfile.Seek(-10, SeekOrigin.Current)
Moves 10 bytes backward from the current position.
❑ Example: Here is an example that demonstrates
random access I/O. It writes the uppercase alphabet
to a file and then reads it back in non-sequential
order.
using System;
using System.IO;

class RandomAccessDemo
{
    static void Main()
    {
        FileStream myfile;
        char ch;
        // FileMode.Create without a FileAccess argument allows reading and writing
        myfile = new FileStream("random.txt", FileMode.Create);

        // First, write the uppercase alphabet
        for (int i = 0; i < 26; i++)
            myfile.WriteByte((byte)('A' + i));

        // Now, read back specified values
        myfile.Seek(0, SeekOrigin.Begin);     // seek to first byte
        ch = (char)myfile.ReadByte();
        Console.WriteLine("First value is " + ch);
        myfile.Seek(1, SeekOrigin.Begin);     // seek to second byte
        ch = (char)myfile.ReadByte();
        Console.WriteLine("Second value is " + ch);
        myfile.Seek(4, SeekOrigin.Begin);     // seek to 5th byte
        ch = (char)myfile.ReadByte();
        Console.WriteLine("Fifth value is " + ch);
        myfile.Seek(7, SeekOrigin.Current);   // skip 7 bytes forward from the current position
        ch = (char)myfile.ReadByte();
        Console.WriteLine("current value is " + ch);
        myfile.Seek(-10, SeekOrigin.End);     // seek to the 10th byte from the end
        ch = (char)myfile.ReadByte();
        Console.WriteLine("backward value is " + ch);
        Console.WriteLine();

        // Now, read every other value.
        Console.WriteLine("Here is every other value: ");
        for (int i = 0; i < 26; i += 2)
        {
            myfile.Seek(i, SeekOrigin.Begin);   // seek to ith byte
            ch = (char)myfile.ReadByte();
            Console.Write(ch + " ");
        }
        Console.WriteLine();
        myfile.Close();
        Console.Read();
    }
}
Important Notes
▪ The current position is given by the FileStream property Position:
long pos = myfile.Position;
▪ The file read by the above program should be a text file (e.g. a Notepad
file). If you want to read a binary file, you should use the binary file
streams:
the BinaryWriter class and the BinaryReader class.
▪ The path to a file: if the physical file name includes a backslash “\”,
use “\\” for each backslash:
string NameOfFile = "C:\\Documents and Settings\\Employees.spr";
▪ File Existence
• One of the valuable operations that the File class can perform is to
check the existence of the file you want to use.
• To check the existence of a file, the File class provides the Exists
method. Its syntax is:
public static bool Exists(string path);
Important Notes
• If you provide only the name of the file, the
program looks for it in the current working folder
of the application. If you provide the path to the file,
the program checks its drive, its folder(s), and the
file itself. In both cases, if the file exists, the
method returns true. If the file cannot be found,
the method returns false.
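A typical use of File.Exists is to check for a file before trying to open it. The following is a minimal sketch; the class ExistsDemo, the method FirstLineOrNull, and the file names are hypothetical.

```csharp
using System;
using System.IO;

static class ExistsDemo
{
    // Returns the first line of the file, or null when the file does not exist.
    public static string FirstLineOrNull(string path)
    {
        if (!File.Exists(path))
        {
            return null;
        }
        using (StreamReader reader = new StreamReader(path))
        {
            return reader.ReadLine();
        }
    }

    static void Main()
    {
        // Hypothetical scratch file in the system's temporary folder.
        string path = Path.Combine(Path.GetTempPath(), "exists_demo.txt");
        File.WriteAllText(path, "first line\nsecond line\n");

        Console.WriteLine(FirstLineOrNull(path) ?? "(file not found)");
        File.Delete(path);
        Console.WriteLine(FirstLineOrNull(path) ?? "(file not found)");
    }
}
```

The check is only a convenience: the file could still disappear between the call to Exists and the attempt to open it, so opening can fail even after Exists returned true.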
Methods for manipulating and determining information about files.
Assignment #1
• Using C#, write a program that reads student
data, such as ID, name, and phone number,
from the user and displays all the data
about them, as shown in the screenshot, using
suitable file modes.
Thank You