DirectSound Capture Using Deviare.

2009 February 24th | By pipaman | Comments (4) | Permalink

Under: examples - opinion

Download Deviare
Download Sourcecode
Download PDF

Contents

Introduction

Today we are going to see how easy it can be to capture audio with Deviare. From players like Windows Media Player, instant messaging applications like Skype & Windows Live Messenger, to any application using DirectSound. The wave output will get captured by us.

Deviare is indeed a powerful framework. Built to resolve most complex tasks in the simplest way. With a few lines of python, all our hooking is done and running. Today performance is extremely important, yet Deviare proves itself as the best. It allows you to also take advantage of the full power of Python

Python

Research

Direct Sound Capturing

I must be honest, I’ve never used the DirectX API in my life, so I was a bit uncertain of how difficult this could be.
I started by looking at MSDN documentation onIDirectSound and IDirectSoundBuffer.
The first goal was to find a safe place to read its sound buffers. I found out that IDirectSoundBuffer::Unlock could serve my intentions well. At this point, the user is telling DirectSound that he has finished writing his wave output and the locks may be released. So, if we step in between, we can safely read the buffer. The user is no longer writing to it, and DirectSound has not yet taken control of it.

I tested it on many applications and it turned to be the right choice.
It works perfectly for WMP, Windows Live Messenger, and many others.
No problems showed up until I stepped with Skype…


Monitoring Skype Conversations

This might be the way many applications handle their sound output, but it was the first application I have seen and I named the case after it.
Later, I found a few articles describing it in detail.

Skype

So, to my surprise, I was not seeing any data being written to the sound buffers after the unlock was called. How the hell is it writing its wave, and how am I supposed to read it?!. It kept me thinking for a while, until I noticed an interesting and constant call to IDirectSoundBuffer::GetCurrentPosition. Then I realized that this writing method depends on constant reading of the play and write buffer pointers. That’s because DirectSound, as most stream based implementations, works with a Circular buffer. Capturing its wave output requires that we keep track of changes in the write pointer. Once we know it has moved forward in the buffer, we can read its steps. Since here we don’t know how much the user has actually written to it, we must know the full size and location of the buffer. Unless we want to read garbage, but I’m sure that’s not the case ;).

Implementation


Deviare Python wrappers

Before we get our hands dirty, let me introduce you to something new in Deviare: Python Wrappers. As you already know Deviare is exposed through a series of COM interfaces. To save ourselves from the work of writing a whole new set of bindings, we used the well known project PyWin32. It’s very friendly to be used directly, as you may see in py_deviare_objects.py, just not enough to me. So I built these wrappers on top of it and made them as transparent as possible.
You’ll find the use of the interface very similar to the way it’s done in our C# examples and in compliance with the python way of life, of course.Python Code

Wave Tools

I wrote these tools to help me write down the captured wave data. This may be obvious to people working in audio projects, but for me I cannot believe there is no native support in Windows to read-write Wav files! Yes, there is native support to write RICH content but come on!

Luckily for me, I found a small sample C++ class inside the DirectX SDK. This was good enough for me to write my own in Python. As you may see, my WaveFile class only supports the write operation. Though, adding a read member should be easy enough for you :). I have also added a lock to it, to ensure our data does not get corrupted by multiple thread operations. You may use it safely.

The structures used were defined exactly as found in DirectSound and WinMM headers. Some of them are used by DirectX to specify the format of the wave content.

COM Type Libraries

By default, DirectX installations do not register their library types. Since we need that information, so Deviare can hook them, I created my own definitions with the interfaces we are interested in. To prevent any collision with previous installations, I used a different GUID. There is a python script that takes care of its registration and it’s automatically ran on demand by our example. Again, definitions are exactly as found on DirectX SDK.

Directsound

Virtual Table Finder

To obtain the virtual tables for the interfaces, we basically have two options. Either we wait for its instantiation by the target process, or we find them ourselves on our own. Our first option is known to work for sure, yet we delay our installations until these events rise. This may also place us in a race and we may not capture all the output. The second one allows us to hook our targets immediately. Yet, in this case, we depend on the library (dsound.dll) being loaded in the same address space of our target. I have placed the two options in our example. If the current one is not working for you, uncomment the other at py_deviare_directsound.py.

Hooking Direct Sound

The first thing we need, is to know every time a sound buffer is created.
For that we are going to intercept calls to IDirectSound::CreateSoundBuffer.
If the calls succeed, we look-up the table location inside the returned instance.

From there we are going to hook four members of IDirectSoundBuffer:Initialize, SetFormat, Unlock and GetCurrentPosition.
The first two, are used to obtain the wave format that the user is writing to the buffer. We also need to watch details from the call that creates the sound buffer, in case it is specified there.

The Unlock member, as our research told us, is used in most applications to notify DirectX that the buffer is written and ready to be played. So we read the buffer pointers and size, to use Deviare’s memory interface to copy all content. We need to be careful, and see if the call actually succeeded. Only then can we save the wave data, else we must discard our buffers.

With applications that keep track of the play and write cursors, we are going to monitor their calls to GetCurrentPosition.
As explained earlier, with this method, I need to know the full size of the buffer and its location. So I save it from the first call to Unlock. Then I virtually divided my buffer in N segments, and filled it with the wave data as the write cursor moves forward in the buffer. Once my buffer contains enough data, I write it down to the wave file. To prevent false positives, in the creation of sound buffers, I delay the creation of my file until I have real data stored.

In case we are monitoring the creation of IDirectSound in the target process, we also need to hook DirectSoundCreate and DirectSoundCreate8 from dsound.dll.
There we can obtain the virtual table for IDirectSound, and follow our quest.

Running Sound Capture

Sound Capture

Easy Steps

Execute the run_me.py located among the deployed files, and you’ll be prompted with a window to type the complete name of the process you want to start monitoring. For example: Skype.exe. Once the program starts capturing, the wave files will be written in the same folder.

Once you are done, click OK on the dialog box to stop recording. Now you can open the .wav files generated, and listen the capture. Do not open them before closing the example as the data may not be readable by then.

Registration

The first execution of our example, will automatically register its interfaces and data types. It will also generate a file labeled .deviare_types_registered to prevent registration on the following executions. You can safely remove the file at any time you want the registration to be run again.

What’s Next

Optimizations

At any point of our handling, performance is essential. Any delay is highly punished by the sound output. So we must be careful about any operation we do inside the function call. This example tries to cache enough data before doing a write operation to disk. In case you need to improve its performance, you should read the data and release the call as soon as possible. Then in a different thread, or in a non punitive call, flush our data to the wave file.

Wave API hooking

This example could be very easily adapted to capture wave data from applications using WinMM API. Most browsers, Flash, and Google Talk use it to throw their sound output.


Hook DirectSoundCapture And Listen To Full Conversations

You should have noticed, when capturing from Skype, that your own voice is not heard. That’s because the application is not echoing its capture from the microphone. To get that too, it is necessary to hook IDirectSoundCaptureBuffer and proceed the same way to read its buffer.


Inspect More COM Interfaces

If you want to discover a lot more about the internals of DirectSound, then Deviare will be very valuable for you. Inspecting COM object is very easy indeed. Simply define one of its interfaces (if its not already registered in the system) and hook them the simpler way.

If you are wondering what other interfaces may be useful for you, try our Deviare COM Console to discover them. It comes with source code, and you are free to adapt it to your needs!

And That’s All Folks, hope you find it useful, enjoy!

Deviare Message Spy

2009 February 18th | By sswan | Comments (0) | Permalink

Under: C# - Deviare - opensource - products - programming

Download messagespy_demo.zip - 250 KB

Download messagespy_src.zip - 249 KB

Deviare Message Spy

Deviare Message Spy

Contents

Introduction

This article presents you with a different perspective of how to inspect window messages, to see how applications are communicating and managing their controls. We are not going to explain what window messages are or what they are used for in this article, so we suggest that you read these excellent articles to understand them: Handling Window Messages (Part 1, Part 2, Part 3). In this article we are going to monitor the Message API from the inside by hooking the target process.

So, what’s the good news?

As a first step when developing, to inspect windows, we open the Spy++ application and start the tedious work of following messages as they are printed in their hundreds. This is helpful most of the time, as we usually want to know what our windows are seeing and receiving. Yet, what happens when we want to know exactly how an application is communicating with its controls (what calls it makes to the message API) or want to see if our messages are getting filtered by someone else? As you may know, Spy++ installs 3 global hooks to receive every Send, Post and Call to a window message handler. The information provided by these methods is not enough to know what messages are coming from our application or if any of them have been filtered by a hook installed earlier in the call chain.

Do not panic, Deviare comes to rescue. What we are going to do is intercept all the Message APIs from the process that the window belongs to and monitor its calls. From there, we can be sure of what messages are being sent from the application to its controls and if any of them are missing from the ones that Spy++ is reporting, then we will know if someone else is watching us…

What happens with the messages not known by Spy++? How are we going to see them? Look what happens with many of the messages used by the standard ListView in windows. Spy++ does not know anything about them if the window is subclassed (for example ATL:SysListView32), and cannot trace its content. Try following LVM_GETNEXTITEM in Outlook Express and you will only see unknown 0×100C messages. The same goes for custom user messages that you may know and want to follow. We need an application that can be customized to our needs!

Deviare Message Spy

To probe our theory, we have built this message spying application. We have added to it a way to lookup windows handlers, hook the process owning it, and correctly report the messages and structures.

Finding a Window: The Spy++ Style Window Finder

To pick the target window and the process we wanted an interface like the one used in Process Explorer and Spy++. Thanks to Mark Belles this was an easy task. He has a great article on how to implement a nice Window Finder, in Code Project.

Selecting a window selected window info

Hooking

In order to install a hook, first we need to identify our target process. After obtaining a window handle from our Window Finder, we can use GetWindowThreadProcessId to identify which process owns the window. From there we use the .Net API to access it and tell Deviare which process we wish to hook.

Win32.GetWindowThreadProcessId(hWnd, out _processId);
_txtProc.Text = Process.GetProcessById(_processId).MainModule.ModuleName

For our monitoring we have divided the API in 2 sets: the Dispatch group, and the Sent and Post group. Monitoring messages that arrive to the first group will provide us with a very similar view of what Spy++ sees. This is because these messages arrived to the application and have not been filtered by any hook. With our second group, we will identify direct and asynchronous calls to the Message API.

Let’s see how we install the hook for one of these functions:

procs = _mgr.get_Processes(0);
procs = _mgr.get _Processes(0)
proc = procs.get_Item(_processId)
IPEModuleInfo mod = proc.Modules.get_ModuleByName("user32.dll");
IExportedFunction fnc = mod.Functions.get_ItemByName("PostMessageW");
_hook = _mgr.CreateHook(fnc);
_hook.Attach(proc);
_hook.OnFunctionCalled += new Deviare.DHookEvents_OnFunctionCalledEventHandler(_hookPst_OnFunctionCalled);
_hook.Properties = (int)DeviareCommonLib.NktHookFlags._call_before;
_hook.Hook();

As you see, we easily pick our target process by Id and select its Module and Function by name. The module name is not important, as it is always going to be “user32.dll”. If you have doubts, you can use Spy Studio to watch the process modules and exported functions.

Once the hook gets installed, we will receive notifications on our handler. From there we parse the function parameters transparently with the interface provided. (These parameters are actually in the target process, and Deviare copies them to our process on our demand and handles all the communication).

int returnVal = callInfo.ReturnValue;
IParams pms = callInfo.Params;
IEnumParams enm = pms.Enumerator;
IParam pm = enm.First;
IParam recvMsgHndl = pms.get_Item(0);
IParam recvMsgParam = pms.get_Item(1);
IParam recvWParam = pms.get_Item(2);
IParam recvLParam = pms.get_Item(3);

After reading all the data we require from the call, we will use our generated Xml to identify the message and properly cast it to its structure and show it properly.

The XML

The XML document in this application was created specifically to link together the message names, values and parameters. As messages like WM_LBUTTONDOWN are predefined as 0×201 we can place this in a XML file containing information on the parameters WPARAM and LPARAM.

<message value="0x201">
<name>WM_LBUTTONDOWN</name>
<return value="">
<returninfo></returninfo>
<returnmisc></returnmisc>
</return>
<wparam value="">
<wname>wParam</wname>
<wmisc>wParam Indicates whether various virtual keys are down. This parameter can be one or more of the following values.
MK_CONTROL
The CTRL key is down.
MK_LBUTTON
The left mouse button is down.
MK_MBUTTON
The middle mouse button is down.
MK_RBUTTON
The right mouse button is down.
MK_SHIFT
The SHIFT key is down.
MK_XBUTTON1
Windows 2000/XP: The first X button is down.
MK_XBUTTON2
Windows 2000/XP: The second X button is down.</wmisc>
</wparam>
<lparam value="">
<lname>lParam</lname>
<lmisc>lParam
The low-order word specifies the x-coordinate of the cursor. The coordinate is relative to the upper-left corner of the client area.
The high-order word specifies the y-coordinate of the cursor. The coordinate is relative to the upper-left corner of the client area.&amp;amp;amp;amp;lt;/lmisc&amp;amp;amp;amp;gt;
</lparam>
<misc></misc>

We could not find any database with this information, so we generated an XML document with the messages that we were interested in knowing about. As you can see, it is easy to simply add any message you want. In the process of building this XML, we used a very nice tool called ApiViewer from ActiveVB.de. Just search for the message names you want and you can evaluate the message values from the names.

The Cast

Now that we can identify the structures used on messages, we need to tell Deviare. Basically we are telling it to interpret our parameter, not as a simple LPARARM or WPARAM type, but as the complex structure we know is there. This is the case for messages like WM_DRAWITEM. So, to read it’s structure contained within the LPARAM we need to cast it as follows:

IParam pm = pms.get_Item(2); //LPARAM
pm = pm.CastTo(“LPDRAWITEMSTRUCT”); //Now our IParam is read as a pointer to DRAWITEMSTRUCT
pm = pm.Evaluated; //Resolve the pointer indirection
//Ready to use IParam as the structure sent by the OS.

It is possible to do this with all of the structures you can find defined in the windows headers. So, you should be able to cast and read any of them that are used in within these messages.

Using Deviare Message Spy

Deviare Message Spy in Action

Deviare Message Spy in Action

Above we have our Deviare Message Spy in action. We selected the contacts list window from Outlook Express (at the bottom left) to spy on. You can see all the message values that were sent via Post and Send Message APIs. LVM_HITTEST has been expanded to show the full values received. As LPARAM is a pointer to the LVHITTESTINFO structure we can find all relevant information contained within.

Hope you enjoyed this article, and found it useful. Let us know what you think!

Known Issues

Many messages have the same Hex Address, such as TB_GETITEMRECT and TTM_UPDATE. Both of these messages have the value of 0×41d but are very different messages.

The TTM_UPDATE Message Forces the current tool to be redrawn. It does not use the wParam and lParam where as TB_GETITEMRECT Message Retrieves the bounding rectangle of a button in a toolbar.

TB is a Toolbar message and TTM is a Tooltip message. As our Spy++ style window finder already finds the window class, such as SysListView32 and ToolbarWindow32, It would be easy to use the class name to tell the program with Xml message is the correct one.

Resources