May 22nd, 2025

CIS 751

Notes 0001

Introduction

Welcome to Multimedia Computer Systems. The term `Multimedia' refers to a whole slew of technologies. Our purpose is to gain an understanding of most of those technologies.

Types of Multimedia

What do you think of when someone says Multimedia? Something that involves many media :-)

I think it's safe to say that Multimedia is something that you see, hear, touch, or otherwise sense: anything that stimulates your senses. In a computer sense, multimedia usually means Text (to a lesser extent), Images, Audio, Video, or a combination of these.

Themes

Although there are many different forms of multimedia, there are recurring themes. Remember the saying `a picture is worth a thousand words'? Well, similarly, often we can represent the same multimedia in totally different forms that lend themselves to different forms of manipulation.

An example of this would be using SVG (Scalable Vector Graphics) as opposed to BMP (Bitmap) files, or similarly using MIDI (Musical Instrument Digital Interface) instead of WAV files, or similarly using TXT instead of PS (PostScript) files.

These types of differences show up all the time. Very often we create media in a semantic representation, one that describes what the content is rather than how it finally looks or sounds: we use layers in Photoshop, we symbolically write down music, or use our text editor to write a book. When we publish the media we've created, we often reduce it to a form more suitable for consumption. After all, you can't hear music if it's written down on paper, or enjoy an image if it's just a series of XML tags.

The type of data we are dealing with also affects the type of algorithms we can use to compress it. For example, while we can apply lossy compression (compression that loses `some' information) to raster images, video, and sound, we generally cannot apply it to text, vector graphics, or symbolic music.
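To make the distinction concrete, here is a minimal sketch. It uses Python's standard zlib module for the lossless case; the `lossy' half is only a toy (dropping the low 4 bits of each sample), not a real audio or image codec, and the sample values are made up for illustration.

```python
import zlib

# Lossless: text must come back byte-for-byte identical.
text = b"a picture is worth a thousand words " * 20
packed = zlib.compress(text)
restored = zlib.decompress(packed)

# Lossy (toy example): keep only the top 4 bits of each 8-bit sample.
samples = [0, 37, 128, 200, 255]             # made-up sample values
quantized = [s >> 4 for s in samples]        # what we would store
approximated = [q << 4 for q in quantized]   # what we get back

print(len(text), len(packed), restored == text)
print(samples, approximated)
```

The text survives exactly, while the samples come back only approximately. That is acceptable for a sound wave or a pixel, but a `nearly right' character in a text file is simply a typo.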

Overview

This section will briefly go over some of the key topics we'll be discussing in more detail later.

Text Typesetting (LaTeX)

This may seem as far away from multimedia as you can possibly get, but books, and printed material are probably the oldest form of multimedia there is.

Basically the process goes like this: you type text (using your favorite text editor), then you somehow transform this semantic representation into something that can be consumed by the eyes of the reader: you print it.

(This is not really the whole picture, since while you're typing it, you're actually seeing the on-screen representation of your text, not the actual text of your document. You can't see the actual text - it's just a series of bits.)

Anyway, taking your text, and converting it to a form suitable for publishing is often accomplished by packages like LaTeX. They take text and turn it into what is basically an image meant for a printer.

LaTeX is usually a bit cleverer than that though, and can actually turn your documents into files with scalable characters, suitable for PDF documents. It also allows you to typeset mathematical formulas, create diagrams, and aids in usual book writing chores (like creating indexes, and cross-references). It is the thing to use for writing books and pretty much anything that's worth writing. Here's a small LaTeX document:

\documentclass{article}
\begin{document}
This is a small LaTeX doc :-)
\end{document}

To `compile' this document, install a LaTeX distribution, save the text above as thedoc.tex, and then just run:

pdflatex thedoc.tex

You can find a lot more examples in The Not So Short Introduction to LaTeX2e.

(Notice the `theme': we take a semantic version of the document, one that is easy to manipulate in a text editor, and we turn it into a PostScript or PDF document. All of a sudden it is suitable for printing and/or viewing/reading, but we can no longer manipulate the text.)

Images, Vector Graphics, Raster

Images are often `created' by taking a picture with a digital camera, or scanning in a photograph (or a drawing), or creating the image in a program like Photoshop or GIMP. Such an image is usually a two-dimensional array of pixels (or color values).

A more abstract view of images that has semantic information is vector graphics. This is equivalent to `text' in the previous section. Vector graphics represent images by specifying location, direction, color, etc., of lines, curves, and other geometric shapes. Just like text, it is easy to manipulate (you can add a line or erase a line), and just like text, it needs to be `rendered' for anyone to actually see the image.

There are many vector graphics format standards, and the prominent one these days is SVG (Scalable Vector Graphics). You can create vector graphics in numerous programs, including Adobe Illustrator, Dia, XFig, or just editing the SVG file directly (it is an XML based format).
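Because SVG is XML, `adding a line or erasing a line' is ordinary document editing. The sketch below uses Python's standard xml.etree.ElementTree module; the tiny drawing itself is made up for illustration.

```python
import xml.etree.ElementTree as ET

# A made-up drawing: one line and one circle.
svg_text = (
    '<svg xmlns="http://www.w3.org/2000/svg" width="100" height="100">'
    '<line x1="0" y1="0" x2="100" y2="100" stroke="black"/>'
    '<circle cx="50" cy="50" r="20" fill="red"/>'
    '</svg>'
)

root = ET.fromstring(svg_text)
NS = "{http://www.w3.org/2000/svg}"

# "Erase a line": remove every <line> element.
for line in root.findall(NS + "line"):
    root.remove(line)

# "Add a shape": append a rectangle.
ET.SubElement(root, NS + "rect", x="10", y="10", width="30", height="30")

tags = [child.tag.replace(NS, "") for child in root]
print(tags)
```

Nothing here touches a pixel; the shapes only become an image when a viewer renders them.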

The term `raster' graphics refers to creating and manipulating images as a series of pixels, i.e., as a two-dimensional array of pixels (the images described at the start of this section).

Where this comes into play is with computer video displays: nearly all are raster based. Your screen displays a series of pixels. It doesn't care for semantic information. It doesn't care if you're displaying text, or vector graphics, or images; the only thing your monitor does is light up dots on the screen, which makes things nice and easy for human eyes (which work by detecting these light `pixels').
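As a rough sketch of what `rendering' means, the snippet below turns a vector description of one line segment into a two-dimensional array of pixels. The dictionary format, the 8x4 canvas, and the simple stepping method are assumptions chosen for illustration; real renderers are far more sophisticated.

```python
# Vector description: a shape is just a few numbers (hypothetical format).
line = {"x1": 0, "y1": 0, "x2": 7, "y2": 3}

WIDTH, HEIGHT = 8, 4  # a tiny made-up canvas

def rasterize(line, width, height):
    """Render one line segment into a 2D array of 0/1 pixels."""
    pixels = [[0] * width for _ in range(height)]
    dx = line["x2"] - line["x1"]
    dy = line["y2"] - line["y1"]
    steps = max(abs(dx), abs(dy))
    for i in range(steps + 1):
        t = i / steps if steps else 0.0
        x = round(line["x1"] + t * dx)
        y = round(line["y1"] + t * dy)
        if 0 <= x < width and 0 <= y < height:
            pixels[y][x] = 1
    return pixels

pixels = rasterize(line, WIDTH, HEIGHT)
for row in pixels:
    print("".join("#" if p else "." for p in row))
```

Once the pixels are lit, the fact that they came from a `line' is gone; the array only knows which dots are on.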

Audio, MIDI, Wave

Audio at first appears like a totally different concept from images, but in fact, it involves pretty much the same ideas. For example, a WAV file is similar to a raster based image. It has the actual sound wave encoded at various time intervals (just like the image has pixels at various locations).
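A small sketch of that idea: sample a sine wave at regular time intervals and store the samples as a WAV file, using Python's standard wave module. The sample rate, pitch, and duration are arbitrary choices, and the file is written to memory rather than disk.

```python
import io
import math
import struct
import wave

SAMPLE_RATE = 8000   # samples per second (arbitrary choice)
FREQUENCY = 440.0    # pitch of the tone in Hz (arbitrary choice)
DURATION = 0.5       # length in seconds (arbitrary choice)

# Sample the sine wave at evenly spaced points in time.
n_samples = int(SAMPLE_RATE * DURATION)
samples = [
    int(32767 * math.sin(2 * math.pi * FREQUENCY * n / SAMPLE_RATE))
    for n in range(n_samples)
]

# Store the samples as 16-bit mono PCM in an in-memory WAV file.
buffer = io.BytesIO()
with wave.open(buffer, "wb") as wav:
    wav.setnchannels(1)
    wav.setsampwidth(2)
    wav.setframerate(SAMPLE_RATE)
    wav.writeframes(struct.pack("<%dh" % n_samples, *samples))

wav_bytes = buffer.getvalue()
print(n_samples, len(wav_bytes))
```

Each sample plays the role a pixel plays in an image: one measured value at one position, here a position in time.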

MIDI, on the other hand, is like vector graphics: it represents the more semantic information about the sound. MIDI encodes instruments, and what they play at what times, but does not encode the actual sound wave (just as vector graphics encodes what shapes are drawn where, but not the actual pixels of those shapes). Producing the sound wave is the job of a synthesizer (which is conceptually similar to a raster display): it makes the semantic MIDI information ready for consumption by the human ear.
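The following toy `synthesizer' illustrates the idea. The score format (note number, start, duration) is a simplification invented for this sketch, not the actual MIDI file format, and every note is rendered as a plain sine wave. The note-to-frequency mapping is the standard one, in which MIDI note 69 is A at 440 Hz.

```python
import math

SAMPLE_RATE = 8000  # samples per second (arbitrary choice)

# Symbolic score (hypothetical format): (note number, start in s, duration in s).
score = [(60, 0.0, 0.25), (64, 0.25, 0.25), (67, 0.5, 0.5)]

def note_to_hz(note):
    """Equal-temperament mapping: MIDI note 69 is A4 = 440 Hz."""
    return 440.0 * 2 ** ((note - 69) / 12)

def synthesize(score, sample_rate):
    """Render note events into a list of float samples."""
    total = max(start + dur for _, start, dur in score)
    out = [0.0] * int(total * sample_rate)
    for note, start, dur in score:
        freq = note_to_hz(note)
        first = int(start * sample_rate)
        for i in range(int(dur * sample_rate)):
            out[first + i] += math.sin(2 * math.pi * freq * i / sample_rate)
    return out

samples = synthesize(score, SAMPLE_RATE)
print(len(score), "events ->", len(samples), "samples")
```

Three short events expand into thousands of samples, which is the same blow-up that happens when a few vector shapes are rendered into pixels.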

Video, Flash, Animation

At its most basic level, video is just a series of images shown one after another. Due to the limitations of the human eye, and some properties of video displays, what we see is not a series of images, but smooth motion.

There are some limitations though. If the images are not flipped fast enough, we humans will spot the individual images, and see what is known as `flicker'. Usually, the picture needs to be redrawn about 60 times per second for us not to see flicker. (TV screens get there with interlacing: NTSC television, for example, sends about 30 full frames per second but draws them as about 60 half-frames, each containing every other line.)

From a more semantic point of view, vector graphics can also be moved around, creating animation. Flash, a commonly used language/system for generating animations (moving pictures) on the web, is one such system. The author specifies objects like lines, curves, etc., and how and where they move or transform. The viewer downloads this semantic information, and the Flash plug-in renders it on the viewer's raster display.
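A minimal sketch of that division of labor: the author's description says only where a shape starts, where it ends, and how long the move takes, and the player works out a position for every frame. The dictionary format and the frame rate are invented for illustration and have nothing to do with Flash's real file format.

```python
FPS = 10  # frames per second (arbitrary choice)

# Hypothetical animation description: move a circle from start to end.
animation = {
    "shape": "circle",
    "start": (0.0, 0.0),
    "end": (100.0, 50.0),
    "seconds": 2.0,
}

def positions(animation, fps):
    """Compute the shape's (x, y) position for every frame."""
    n_frames = int(animation["seconds"] * fps)
    (x0, y0), (x1, y1) = animation["start"], animation["end"]
    frames = []
    for f in range(n_frames + 1):
        t = f / n_frames  # fraction of the move completed, 0.0 to 1.0
        frames.append((x0 + t * (x1 - x0), y0 + t * (y1 - y0)))
    return frames

frames = positions(animation, FPS)
print(len(frames), frames[0], frames[-1])
```

Only the short description has to be downloaded; every in-between frame is computed on the viewer's machine.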

Vector Graphics/MIDI vs. Raster/Wave

There are advantages and disadvantages to working with semantic information. One of the major advantages is that often, semantic information is much smaller. A five-minute MIDI tune can be about 50 KB, while a similarly long WAV would be around 50 MB (it would be around 5 MB for an MP3 file, i.e., still huge compared to MIDI).
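The WAV and MP3 figures can be checked with a little arithmetic. The parameters below are assumptions, since the notes do not state them: CD-quality audio (44,100 samples per second, 16 bits, stereo) for the WAV and a 128 kbit/s bit rate for the MP3.

```python
# Assumed parameters: CD-quality stereo PCM and a 128 kbit/s MP3.
SECONDS = 5 * 60
SAMPLE_RATE = 44_100            # samples per second
BYTES_PER_SAMPLE = 2            # 16-bit samples
CHANNELS = 2                    # stereo
MP3_BITS_PER_SECOND = 128_000

wav_bytes = SECONDS * SAMPLE_RATE * BYTES_PER_SAMPLE * CHANNELS
mp3_bytes = SECONDS * MP3_BITS_PER_SECOND // 8

print(wav_bytes)   # 52920000, i.e. roughly 50 MB
print(mp3_bytes)   # 4800000, i.e. roughly 5 MB
```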

The other major advantage is ease of manipulation. If you have a MIDI file, you can hear how it sounds without some instruments. You can also add instruments. You can also change the instruments themselves (play a melody via a piano instead of a guitar). These things would be a major pain (if not impossible) with a WAV file.
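With a symbolic score these edits are one-liners. The event list below is a made-up stand-in for MIDI data (real MIDI files use channels and program numbers rather than instrument names), but the idea is the same.

```python
# Hypothetical symbolic score: each event names its instrument.
score = [
    {"instrument": "guitar", "note": 60, "start": 0.0},
    {"instrument": "drums", "note": 36, "start": 0.0},
    {"instrument": "guitar", "note": 64, "start": 0.5},
]

# Hear the tune without the drums.
without_drums = [e for e in score if e["instrument"] != "drums"]

# Play the guitar melody on a piano instead.
on_piano = [
    {**e, "instrument": "piano"} if e["instrument"] == "guitar" else e
    for e in score
]

print(len(without_drums))
print(sorted({e["instrument"] for e in on_piano}))
```

Doing the same to a WAV recording would mean separating sounds that have already been mixed into a single wave.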

So what's the catch? If it's so great, why use anything else? Well, it turns out that it's quite hard to create this semantic version. It is much harder to draw a good image in Adobe Illustrator than it is to take a digital picture with a camera, and harder to create a MIDI file than it is to just record a live band performing (well, this is debatable; but certainly the live recorded band would sound `more natural' than synthesized MIDI music).

Also, some kinds of manipulation are easier on the wave form (or raster image) than on the semantic information. For example, applying filters in Adobe Photoshop seems more natural if you're working with raster graphics.

Another major issue is the lack of viewing/listening tools. For example, I know that visitors to my website will almost certainly have the ability to view JPEG, GIF, and PNG images, but I'm doubtful if most visitors can open (have proper plug-ins for) SVG files.

© 2006, Particle