Snack manual, version 2.1

Installing Snack

First, install Snack according to its installation instructions. To use Snack from Python, put the file tkSnack.py somewhere on your Python path.

Using Snack: an overview

Initializing

You need to use Tkinter in order to use Snack. Even if you don't use any GUI elements that Tkinter offers, you will still need an active Tk object in your program. In order for Snack to identify the Tk object it should use, you will need to run the initializeSnack procedure.

The beginning of a program that uses Snack might look like:

from Tkinter import *
root = Tk()

import tkSnack
tkSnack.initializeSnack(root)

# Now you can use tkSnack commands and objects
# ...

Using Sound objects

You create sound objects the same way you create any Python objects.

mysound = tkSnack.Sound()

Since we gave no additional arguments, the sound object created by this will contain no sound data. We can give it some data in a number of ways -- by recording from the current input channel, by reading from a file, and so on.

Let's try reading from a file. If you're using Windows, you have at least a few WAV files sitting conveniently on your hard drive. (If you're using another operating system, you'll have to locate your own sound files to read from :-).)

mysound.read('c:/windows/media/chord.wav')
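
If no WAV file is at hand, a short test tone can be generated with Python's standard wave module. The following is a minimal sketch and not part of Snack; the function name write_test_tone, the filename, and the tone parameters are illustrative assumptions.

```python
import math
import struct
import wave

def write_test_tone(filename, freq=440.0, seconds=1.0, rate=16000):
    """Write a mono, 16-bit sine tone to a WAV file; return the sample count."""
    n_samples = int(seconds * rate)
    frames = bytearray()
    for n in range(n_samples):
        # Use half of the 16-bit range to leave some headroom.
        value = int(16383 * math.sin(2 * math.pi * freq * n / rate))
        frames += struct.pack('<h', value)
    out = wave.open(filename, 'wb')
    try:
        out.setnchannels(1)
        out.setsampwidth(2)      # 2 bytes per sample = 16-bit linear
        out.setframerate(rate)
        out.writeframes(bytes(frames))
    finally:
        out.close()
    return n_samples

# Hypothetical usage, once Snack has been initialized:
# write_test_tone('tone.wav')
# mysound.read('tone.wav')
```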

Now for the moment of truth. Try playing your sound with:

mysound.play()

You can create a new sound object and load a file in the same step using the "load" option:

tada = tkSnack.Sound(load='c:/windows/media/tada.wav')
tada.play()

(Note: Another possibility is to use tkSnack.Sound(file='filename'). Using file instead of load will "link" the disk file to the sound object instead of immediately loading it into memory. This will limit what you can do to the object, since a number of Snack's usual sound methods are only available to "in-memory" sounds.)

You can perform a number of manipulations on sound objects. For example, let's tack a couple of copies of the Windows chord sound onto the end of the ta-da sound. Then we'll delete a few thousand samples from the middle of the sound (those between sample 10,000 and 40,000), and finally reverse the sound.

tada.concatenate(mysound)
tada.concatenate(mysound)
tada.play()
tada.cut(start=10000, end=40000)
tada.play()
tada.reverse()
tada.play()
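
To see what these three operations do to the sample data, it can help to model a sound as a plain Python list. This is only a sketch of the semantics, not how Snack stores or edits audio; the helper functions and the sample counts are made up, and the cut range is assumed to include both end points.

```python
def concatenate(samples, other):
    """Return samples with other appended (Sound.concatenate works in place)."""
    return samples + other

def cut(samples, start, end):
    """Return samples with start..end removed (end assumed to be inclusive)."""
    return samples[:start] + samples[end + 1:]

def reverse(samples):
    """Return the samples in reverse order."""
    return samples[::-1]

tada = list(range(1, 50001))    # stand-in for the ta-da samples
chord = [0] * 20000             # stand-in for the chord samples

tada = concatenate(tada, chord)
tada = concatenate(tada, chord)     # 50000 + 2 * 20000 samples
tada = cut(tada, 10000, 40000)      # removes 30001 samples
tada = reverse(tada)                # the chord copies now come first
```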

We can write the sound back to a disk file, and even magically switch the format.

tada.write('mangled-tada.au')

Audio and mixer controls

Snack has two objects that control aspects of your computer's sound system. The audio object gets and sets properties of the sound devices. To find out what the available input devices on your computer are and what sample rates the current input device can record at, try:

tkSnack.audio.inputDevices()
tkSnack.audio.rates()

To set the output volume to 30%, try:

tkSnack.audio.play_gain(30)

The mixer object controls various aspects of your computer's sound mixers, such as which input jack is currently being used and whether it's recording in stereo or mono.

New canvas objects

Snack provides three new kinds of items that can be drawn on Tkinter Canvases:

waveform: a raw graph of the sound data, i.e., time on the x-axis and sample amplitude on the y-axis.
section: a power spectrum of the sound (at a given time), as calculated by Fast Fourier Transform, i.e., frequency on the x-axis and amplitude on the y-axis.
spectrogram: a spectrogram of the sound, i.e., time on the x-axis, frequency on the y-axis, and amplitude represented by the darkness of the pixel.

These items have the same options as regular Tkinter Canvas items like lines, arcs, etc., and some of their own. Try:

c = tkSnack.SnackCanvas(root, height=400)
c.pack()
c.create_waveform(0, 0, sound=mysound, height=100, zerolevel=1)
c.create_spectrogram(0, 150, sound=mysound, height=200)

The Sound class

Options

The following attributes may be specified using optional arguments in the initialization of the sound object. They may be read or set after initialization by using the methods cget and config/configure. They may also be read or set by treating the sound object as a dictionary, e.g.,

mysound["encoding"] = "Lin32"

The options:

name =identifier: what name Tcl knows your sound under. Not terribly useful inside Python.
load =filename: specifies that the file named by filename should be read into memory after creating the sound. (Using this option allows you to use the in-memory manipulation methods of the Sound object.)
file =filename: specifies the filename of an on-disk file that should be linked to the sound. (Using this option means that many of the in-memory manipulation methods of a Sound object will not be useable.)
channel =channel-name: specifies that audio data resides on a channel which should be linked to the sound. In these cases the audio data is not loaded into memory, which is useful when playing large files or when using streaming audio. However, the Snack canvas types, e.g., waveforms, cannot be linked to sounds of these types.
frequency =integer: The sampling rate of the sound in samples per second.
channels =x: how many channels the sound uses. Values should be an integer greater than or equal to 1, or "Mono" or "Stereo".
encoding =encoding-name: Possible values for the encoding format of the sound are:
fileformat =format-name: Currently supported file formats are WAV, MP3, AU, SND, AIFF, SD, SMP, CSL, and RAW binary, as listed under the read method. (These formats can be read -- not all of them can be written.)
skiphead =n: is used to skip an unknown file header of length n bytes.
byteorder =string: "littleEndian" or "bigEndian"
guessproperties =boolean: specifies that Snack should try to infer properties such as byte order, sample encoding format, and sample rate for raw files by analyzing the contents of the files. Byte order is almost always detected correctly.
buffersize =integer: specifies the size of the internal buffer in samples, for channel-based sounds.
precision: specifies whether sound data should be handled using single or double precision internally.

Methods

append (binary-string)

Not yet implemented.

cget (option)

Retrieves the value of an option for the sound. The possible options are listed above. It is also possible to access the options by treating the sound object as a dictionary, i.e., the following two expressions are equivalent:

mysound["encoding"]
mysound.cget("encoding")

concatenate (othersound)

Concatenates the sample data from othersound to the end of this sound. The sounds must be of the same type, i.e., have the same sample rate, sample encoding format, and number of channels. This command applies to in-memory sounds only.

configure (option=value ...)

Sets the options for the sound. The possible options are listed above. It is also possible to access the options by treating the sound object as a dictionary, i.e., the following two expressions are equivalent:

mysound["byteorder"] = "littleEndian"
mysound.configure(byteorder="littleEndian")

configure may be abbreviated as config.
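
The equivalence between cget/configure and dictionary-style access can be pictured with a small stand-in class. This sketch shows one way such an interface can be built in Python; it is not the tkSnack source, and the class name OptionHolder is hypothetical.

```python
class OptionHolder(object):
    """Hypothetical stand-in for an object with cget/configure-style options."""

    def __init__(self, **options):
        self._options = dict(options)

    def cget(self, option):
        return self._options[option]

    def configure(self, **options):
        self._options.update(options)

    config = configure              # the abbreviated spelling

    def __getitem__(self, option):
        # obj["encoding"] is routed to cget
        return self.cget(option)

    def __setitem__(self, option, value):
        # obj["encoding"] = value is routed to configure
        self.configure(**{option: value})

snd = OptionHolder(byteorder="littleEndian")
snd["encoding"] = "Lin32"           # same effect as snd.configure(encoding="Lin32")
snd.config(byteorder="bigEndian")   # same effect as snd["byteorder"] = "bigEndian"
```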

copy (othersound)

Copies sample data from othersound. Optionally a range of samples to copy can be specified using the start and end options. Any active play operation is stopped before the command is executed if the format of the new sound differs from the current one. This command applies to in-memory sounds only.

crop (start=n, end=n)

Crops the sound to the given range [start..end], i.e., all samples before and after these limits will be removed. This command applies to in-memory sounds only.

data (variable=tclVariable)

Not yet implemented.

destroy( )

Removes the Tcl command associated with this sound and frees its storage.

dBPowerSpectrum ( )

Computes the log FFT power spectrum of the sound (at the sample number given in the start option) and returns a list of dB values. See the section item for a description of the rest of the options. Optionally an ending point can be given, using the end option. In this case the result is the average of consecutive FFTs in the specified range. Their default spacing is taken from the fftlength but this can be changed using the skip option, which tells how many points to move the FFT window each step. Options:

start
end
fftlength
windowlength
windowtype
skip
channel
preemphasisfactor
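
The averaging described above can be sketched in pure Python. This is an illustration of the idea only: it uses a naive DFT and a Hamming window, ignores the channel and preemphasisfactor options, and makes no claim to match the scaling of the dB values Snack returns. The function names are assumptions of this sketch.

```python
import cmath
import math

def hamming(n):
    """Symmetric Hamming window of n points."""
    if n < 2:
        return [1.0] * n
    return [0.54 - 0.46 * math.cos(2 * math.pi * i / (n - 1)) for i in range(n)]

def frame_power(frame, fftlength):
    """Power in bins 0 .. fftlength/2 - 1 of a zero-padded, naive DFT."""
    padded = list(frame) + [0.0] * (fftlength - len(frame))
    powers = []
    for k in range(fftlength // 2):
        acc = 0j
        for n, x in enumerate(padded):
            acc += x * cmath.exp(-2j * math.pi * k * n / fftlength)
        powers.append(abs(acc) ** 2)
    return powers

def db_power_spectrum(samples, start=0, end=None, fftlength=64,
                      windowlength=None, skip=None):
    windowlength = windowlength or fftlength
    skip = skip or fftlength        # default spacing taken from fftlength
    window = hamming(windowlength)
    # One frame at start, or consecutive frames covering start..end.
    if end is None:
        positions = [start]
    else:
        positions = list(range(start, end - windowlength + 2, skip)) or [start]
    total = [0.0] * (fftlength // 2)
    for pos in positions:
        frame = [x * w for x, w in zip(samples[pos:pos + windowlength], window)]
        for k, p in enumerate(frame_power(frame, fftlength)):
            total[k] += p
    # Average the power over the frames, then convert to dB; the small
    # floor avoids log10(0) for silent input.
    return [10.0 * math.log10(p / len(positions) + 1e-12) for p in total]
```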

filter (filter=filterobject)

Applies the filter to the sound. This command applies to in-memory sounds only.

flush ( )

Removes all audio data from the sound. This command applies to in-memory sounds only.

info ( )

Returns a string with information about the sound. The elements of the string are [length, rate, max, min, encoding, channels, fileFormat, headerSize].
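
Since the elements come in a fixed order, it can be convenient to pair them with their names. A small sketch follows; the helper name info_to_dict is an assumption, and it accepts either a whitespace-separated string or a sequence because the exact return type may differ between versions.

```python
INFO_FIELDS = ("length", "rate", "max", "min",
               "encoding", "channels", "fileFormat", "headerSize")

def info_to_dict(info):
    """Pair each element of an info() result with its field name."""
    if isinstance(info, str):
        info = info.split()
    return dict(zip(INFO_FIELDS, info))

# Hypothetical usage:
# details = info_to_dict(mysound.info())
# print(details["rate"], details["channels"])
```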

insert (sound=othersound, position=sample ...)

Inserts othersound at position sample. Optionally a range of samples to copy can be specified, using the start and end options. This command applies to in-memory sounds only.

length ( )

Gets the length of the sound. With an additional numeric argument, it will set the length of the sound. The unit option specifies whether the sound should be measured in "SAMPLES" (the default) or "SECONDS". If the new length is larger than the current length, the sound is padded with additional silence.
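
The relation between the two units, and the padding behaviour, can be modelled as follows. This is a sketch on plain lists with made-up helper names; that a shorter length truncates the sound is an assumption, since only the padding case is described above.

```python
def seconds_to_samples(seconds, rate):
    """Convert a duration in seconds to a whole number of samples."""
    return int(round(seconds * rate))

def samples_to_seconds(n_samples, rate):
    """Convert a sample count to a duration in seconds."""
    return n_samples / float(rate)

def set_length(samples, new_length):
    """Pad with silence (zeros) when growing; truncate when shrinking (assumed)."""
    if new_length >= len(samples):
        return samples + [0] * (new_length - len(samples))
    return samples[:new_length]
```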

max ( )

Returns the largest positive sample value of the sound. A range of samples to be examined can be specified with the start and end options. The channel to be examined can be specified with the channel option. The default is to check all channels and return the maximum value.

min ( )

Returns the largest negative sample value of the sound. A range of samples to be examined can be specified with the start and end options. The channel to be examined can be specified with the channel option. The default is to check all channels and return the minimum value.

pause ( )

Pauses the current play/record operation. The next pause() invocation resumes play/record. If several instances of a sound object are playing, all of them are paused.

pitch ( )

Returns a list of pitch values computed using the AMDF method. The values are spaced 10 ms apart. A range of samples can be given using the start and end options. If a frequency range of valid pitch values is known, this can be specified using the maxpitch and minpitch options.
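
AMDF stands for average magnitude difference function: the signal is compared with delayed copies of itself, and the delay giving the smallest average difference is taken as the pitch period. The sketch below estimates one value for a single frame; it is a textbook version for illustration, not Snack's implementation, and the function name and default pitch limits are assumptions.

```python
def amdf_pitch(frame, rate, minpitch=60, maxpitch=400):
    """Estimate the pitch in Hz of one frame of samples using the AMDF."""
    min_lag = int(rate / maxpitch)      # shortest period considered
    max_lag = int(rate / minpitch)      # longest period considered
    best_lag, best_score = None, None
    for lag in range(min_lag, max_lag + 1):
        n = len(frame) - lag
        if n <= 0:
            break
        # Average absolute difference between the frame and its delayed copy.
        score = sum(abs(frame[i] - frame[i + lag]) for i in range(n)) / float(n)
        if best_score is None or score < best_score:
            best_lag, best_score = lag, score
    if best_lag is None:
        return 0.0
    return rate / float(best_lag)
```

A real pitch tracker also has to decide whether a frame is voiced at all, which this sketch does not attempt.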

play ( )

Plays the sound. All options are ignored if play() is used to resume a paused play operation. If a play() command is issued while another one is in progress, the new one is queued up and starts to play as soon as possible. The lag before this new sound is audible can be controlled using the audio.playLatency() command.

For in-memory sounds, a number of options are available.

start end	specifies a range of samples to play
output	can specify any of the possible output ports returned by the mixer.outputs() command
blocking	specifies whether playback should be asynchronous or not, i.e., if it is to be played in the background or if the play() command should return only after the sound has been played.
command	specifies a command to be executed when the end of the sound is reached
device	selects which audio device to use
filter	specifies a filter which is to be applied during output
starttime	schedules the start of playback (in ms) relative to a previous play operation
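
As an illustration of the start, end, and blocking options, a helper like the following could play several excerpts of a sound one after another. It is a sketch that relies only on the options described above; the helper name play_segments and the sample ranges are made up.

```python
def play_segments(sound, segments):
    """Play each (start, end) sample range in turn, waiting for each to finish."""
    for start, end in segments:
        # blocking=1 asks play() to return only after the range has been played.
        sound.play(start=start, end=end, blocking=1)

# Hypothetical usage with an in-memory sound:
# play_segments(tada, [(0, 8000), (16000, 24000)])
```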

read (filename)

Reads new sound data from a file. Currently supported file formats are WAV, MP3, AU, SND, AIFF, SD, SMP, CSL, and RAW binary. The command returns the file format detected. It is possible to force a file to be read as RAW by setting the option fileformat=RAW. In this case, properties of the sound data can be specified by hand, using the rate, channels, encoding, skiphead, byteorder, and guessproperties options, as described above.

record ( )

Starts recording data from the audio device into the sound object. You may use the input option to specify one of the available input ports (as returned by the mixer.inputs() command) and the device option to select which audio input device to use.

For in-memory sounds, the append=1 option specifies that the new audio data should be appended to the end of the existing sound instead of replacing it.

For channel-based sounds, the fileformat option can be used to specify the file format to be used when writing data, since there is no filename to infer the format from.

reverse ( )

Reverses the sound. A range of samples can be specified with the start and end options. This command applies to in-memory sounds only.

sample (sample)

Gets the value of the specified sample number. Sets the value with an additional numeric argument. When setting samples, one value should be specified for each channel you want to change. Some examples of setting:

# Sets the 1000th sample to 0 (of a mono sound)
mysound.sample(1000, 0)

# Sets both channels of a stereo sound
mysound.sample(1000, 0, 0)

# Sets only the left channel, leaves right channel unchanged
mysound.sample(1000, left=0)

# Sets only the right channel, leaves left channel unchanged
mysound.sample(1000, right=0)

stop ( )

Stops the current play or record operation. If there is a queue of sounds to play, each of them can stop playback using stop(). If a callback was registered using the command option to play(), it is not executed.

write (filename)

Writes sound data to a file. A range of samples to save can be specified using the start and end options. The file format is guessed from the filename extension, but the guess can be overridden with the fileformat option. If you specify RAW file format, the sound will be saved to file without a header and using the natural byte order of the machine (overrideable with the byteorder option).
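
When writing RAW files for another machine, it can be useful to know which byteorder string corresponds to the machine you are on. The following sketch maps Python's sys.byteorder onto the two values the byteorder option accepts; the helper names are assumptions.

```python
import sys

def native_byteorder_option():
    """Return the byteorder option string matching this machine."""
    return "littleEndian" if sys.byteorder == "little" else "bigEndian"

def opposite_byteorder_option():
    """Return the byteorder option string for the other byte order."""
    return "bigEndian" if sys.byteorder == "little" else "littleEndian"

# Hypothetical usage: write a headerless file in the non-native byte order.
# tada.write('tada.raw', fileformat='RAW', byteorder=opposite_byteorder_option())
```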

The audio object

The audio object gives access to various properties of the available audio devices. It is created automatically by initializeSnack.

Methods

encodings ( )

Returns a list of supported sample encoding formats for the currently selected device.

rates ( )

Returns a list of supported sample rates for the currently selected device.

inputDevices ( )

Returns a list of available audio input devices.

playLatency ( )

Sets/queries (in ms) how much sound will be queued up at any time to the audio device for playback. A low value makes new sound reach the loudspeakers quickly at the risk of gaps in the output stream. An appropriate value should be chosen with regard to processor speed and load.

pause ( )

Toggles between pause/play for all playback on the audio device.

play ( )

Resumes paused playback on the audio device.

play_gain ( )

Returns the current play gain value if invoked without a parameter. If an integer value is given, play gain is set to the given value. Valid values are in the range 0 to 100.

outputDevices ( )

Returns a list of available audio output devices.

record_gain ( )

Returns the current record gain value if invoked without a parameter. If an integer value is given, record gain is set to the given value. Valid values are in the range 0 to 100.

selectOutput (device)

Selects an audio output device to be used as default.

selectInput (device)

Selects an audio input device to be used as default.

stop ( )

Stops all playback on the audio device.

The mixer object

The mixer object gives access to various properties of mixer devices, such as input/output jack, supported ports, mixer lines, and gain. It is created automatically by initializeSnack.

Methods

channels (line)

Returns a list with the names of the channels for the specified line.

devices ( )

Returns a list of available mixer devices.

input ( )

Gets/sets the current input jack. You can optionally give a boolean Tcl variable as an argument.

inputs ( )

Returns a list of available input ports.

lines ( )

Returns a list with the names of the lines of the mixer device.

output ( )

Gets/sets the current output jack. You can optionally give a boolean Tcl variable as an argument.

outputs ( )

Returns a list of available output ports.

update ( )

Updates all linked variables to reflect the status of the mixer device.

volume (line)

Returns the current volume setting for the specified mixer line. You can optionally link a Tcl variable to the value by including it as an argument. If you link two Tcl variables, they are used for the left and right channels respectively.

select (device)

Selects a mixer device to be used as the default.

The Filter class

Filter objects can interact with sound objects either during playback or by using the filter() command of the sound object.

Filters in Snack are still in an early stage of development. Consult the Snack documentation for further details.

The SnackCanvas class

SnackCanvas is a subclass of Tkinter.Canvas that has three additional kinds of canvas items: waveforms, spectrograms, and sections (power spectra).

Waveforms

Draw waveform items on the canvas using the create_waveform method. Obligatory arguments are the x and y coordinates of the waveform's top-left corner. Options are:

anchor	works as for ordinary Tk canvas items
channel	selects which channel to show for multi-channel sounds. Use "left", "right", "both", "all", -1 (all), or a channel number counting from 0 (left).
end	select the end-point of the time-range to draw
fill	works as for ordinary Tk canvas items
frame	boolean value controlling whether a frame will be drawn
height	the height of the waveform
limit	specifies the maximum shown value for the sound amplitude
pixelspersecond	determines the scaling factor in the x direction, which also gives the width. If both width and pixelspersecond are specified, the waveform will be cut at one end depending on if a start or end option was also given.
shapefile	specifies a file for storing/retrieving precomputed waveform shape information
sound	specifies which sound object to link to
start	selects the starting point of the time-range to draw
stipple	works as for ordinary Tk canvas items
subsample	useful for large sounds to specify how precisely they should be analyzed for the shape calculation. The default value 1 uses every sample in the sound to draw the waveform envelope, which can be slow for large sounds. A value of 10 uses every tenth. Care should be used when specifying values. Using large values may lead to incorrect envelope shapes.
tags	works as for ordinary Tk canvas items
width	width of the waveform. See the entry for pixelspersecond for what happens if you specify both options.
zerolevel	specifies whether a line will be drawn for the zero amplitude level.
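
The trade-off behind the subsample option can be illustrated with a small model of an envelope calculation: for each pixel column, take the minimum and maximum of the samples that fall into it. This is a sketch of the general idea, not Snack's drawing code; the function name and the test data are made up.

```python
def envelope(samples, width, subsample=1):
    """Return one (min, max) pair per pixel column.

    Only every subsample-th sample within a column is examined.
    """
    columns = []
    per_column = len(samples) / float(width)
    for x in range(width):
        lo = int(x * per_column)
        hi = max(int((x + 1) * per_column), lo + 1)
        chunk = samples[lo:hi:subsample]
        columns.append((min(chunk), max(chunk)))
    return columns

samples = [0] * 100
samples[5] = 9                                  # a single short spike
exact = envelope(samples, 10, subsample=1)      # the spike shows in column 0
coarse = envelope(samples, 10, subsample=10)    # the spike is skipped over
```

With subsample=10 only one sample per column is examined here, so the spike disappears from the envelope, which is the kind of incorrect shape the warning above refers to.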

Spectrograms

Draw a spectrogram of a sound on the canvas with the create_spectrogram method. Obligatory arguments are the x and y positions of the top-left corner of the spectrogram. Options are:

anchor	works as for ordinary Tk canvas items
brightness	takes a value between -100.0 and 100.0
channel	selects which channel to show for multi-channel sounds. Use "left", "right", "both", "all", -1 (all), or a channel number counting from 0 (left).
colormap	takes a list of colours as parameter. At least two must be specified. The first colour is used for the lowest intensity in the spectrogram. An empty list gives the default 32-level grey scale.
contrast	takes a value between -100.0 and 100.0
end	gives the end-point of the time-range to be drawn.
fftlength	specifies the number of FFT points (8, 16, 32, 64, 128, 256, 512, 1024, 2048, 4096).
gridcolor	specifies the colour of the grid.
gridfspacing	the spacing between frequency markers on the y-axis in Hertz. The default value of 0 means no grid.
gridtspacing	the spacing between the time markers on the x-axis in seconds. The default value of 0 means no grid.
height	height of the spectrogram.
pixelspersecond	determines the scaling factor in the x direction, which also gives the width. If both width and pixelspersecond are specified, the spectrogram will be cut at one end depending on if a start or end option was also given.
preemphasisfactor	specifies the amount of preemphasis to be applied to the signal prior to the FFT analysis.
sound	specifies which sound to link to.
start	the starting-point of the time-range to be drawn.
tags	works as for ordinary Tk canvas items.
topfrequency	the frequency value at the top of the spectrogram.
width	width of the spectrogram. See the entry for pixelspersecond for what happens if you specify both options.
windowtype	"hanning", "hamming", "bartlett", "blackman", or "rectangle"
winlength	specifies the size of the hamming window, which should be equal to or less than the number of FFT points.

Currently spectrograms have a limit of 32767 pixels.
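
Because pixelspersecond also determines the width, it is easy to check in advance whether a long sound would exceed this limit. The sketch below assumes the limit applies to the width in pixels; the function names are made up.

```python
MAX_SPECTROGRAM_PIXELS = 32767      # limit quoted in this manual

def spectrogram_width(n_samples, rate, pixelspersecond):
    """Width in pixels of a spectrogram covering n_samples at the given rate."""
    return int(round(n_samples / float(rate) * pixelspersecond))

def fits_limit(n_samples, rate, pixelspersecond):
    """True if the resulting width stays within the limit."""
    return spectrogram_width(n_samples, rate, pixelspersecond) <= MAX_SPECTROGRAM_PIXELS
```

For example, ten seconds of sound at 250 pixels per second gives a width of 2500 pixels, well within the limit, while 200 seconds at the same scale would need 50000 pixels.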

Sections (power spectra)

Draw an FFT log power spectrum section of a sound on a canvas with the create_section method. Obligatory arguments are the x and y coordinates of the top-left corner of the section. Options are:

anchor	works as for ordinary Tk canvas items
channel	selects which channel to show for multi-channel sounds. Use "left", "right", "both", "all", -1 (all), or a channel number counting from 0 (left).
end	gives the end-point of the time-range to be drawn.
fftlength	specifies the number of FFT points (8, 16, 32, 64, 128, 256, 512, 1024, 2048, 4096).
fill	works as for ordinary Tk canvas items
frame	specifies whether a frame will be drawn
height	height of the section.
maxvalue	specifies the top of the range (in dB) which will be shown. The default range is 0.0 to -80.0 dB.
minvalue	specifies the bottom of the range (in dB) which will be shown. The default range is 0.0 to -80.0 dB.
preemphasisfactor	specifies the amount of preemphasis to be applied to the signal prior to the FFT analysis.
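skip	specifies how many points to move the FFT window at each step when averaging over a range (see dBPowerSpectrum).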
sound	specifies which sound to link to.
start	the starting-point of the time-range to be drawn.
stipple	works as for ordinary Tk canvas items
tags	works as for ordinary Tk canvas items.
topfrequency	the highest frequency value shown for the section.
width	width of the section.
windowtype	"Hamming", "Hanning", "Bartlett", "Blackman", or "Rectangle"
winlength	specifies the size of the hamming window, which should be equal to or less than the number of FFT points.

Putting SnackCanvas items on regular Canvases

It's possible to draw these new canvas items onto any canvas in your program, not just those that are instances of SnackCanvas. You might need to do this if you're using elaborations or subclasses of Canvas that have been written by other people, for example, if you want to draw a waveform on a ScrolledCanvas from the Python Megawidget collection.

To accomplish this, tkSnack provides module-level versions of create_waveform, create_section, and create_spectrogram. Simply use the non-Snack canvas as the first argument. Instead of:

NonSnackCanvas.create_waveform(0, 0, sound=tada)

use:

tkSnack.create_waveform(NonSnackCanvas, 0, 0, sound=tada)

Last updated Mon 23 Jul 2001 13:21:20 BST