rokonoko |
Posted: Tue Feb 12, 2008 10:22 pm Post subject: Capturing Closed Captions (CC) Through 1seg Broadcasts Post Rating: 0 |
Here is a description of the method I use to extract closed captions from Japanese TV broadcasts, followed by some practical hints for those who want to apply it themselves.
Theory
There are several high-definition broadcasting systems in Japan, and the majority of dramas tracked on this server come from them. Someone somewhere records the broadcast of a drama and converts it into a format suitable for PC media players. Only the video and audio get transferred to the PC. But broadcasts usually carry other types of data which are lost during conversion, and closed captions (CC) are one of them. While it is possible to record the raw media stream, it is hard to get CC out of it because the stream itself is scrambled. I personally don’t know whether anyone has found a way to descramble it in a controlled environment, i.e. on a PC.
But there is also one broadcasting format which usually gets overlooked. Its name is “1seg” (OneSeg). It is a format for mobile devices (cell phones etc.). The quality is not high, but it is good enough for such devices. The broadcast content is almost an exact copy of one of the aforementioned high-definition broadcasts, except that it is not scrambled. This is good because one can get hold of the raw media stream and rip CC from it (theoretically). In practice it is not that easy. Recording a 1seg broadcast is not a problem: there are plenty of 1seg TV tuners for PC. The problem is that all manufacturers make their tuners encrypt the recorded data for copyright control.
So far so good. Or bad. It seems there is no way to get CC without defeating some sort of protection (which is also highly illegal, of course). But there is good news: with a 1seg tuner for PC you get to see CC on your PC screen. You can usually capture them as images and even OCR them. Closer, but still too tiresome; no one is going to do that (not me for sure ; ). Luckily, there is an easier way, and it lies in the answer to the question: how does CC get to my screen?
Every 1seg tuner comes with a piece of software (a media player) used to watch the TV programs the tuner receives. This player renders video, decodes audio and, yes, displays CC (be careful though, some 1seg tuners don’t support CC at all). So the player must contain some mechanism for displaying CC. Software engineers are for the most part lazy people (in a very constructive way of course): they like to reuse existing mechanisms instead of making their own. And a very convenient ready-made mechanism for displaying Japanese text on Win32 is the Windows API function DrawTextW. Of course, not all players use it; some might implement their own mechanisms. But there are also players which rely on DrawTextW entirely.
What does this give us? Quite a lot, actually. There are plenty of programs which let us monitor (spy on) and log calls to Windows API functions (including DrawTextW) and even log the parameters passed to them (for example, the CC text displayed by DrawTextW). Using one such program I manage to extract CC in real time while recording the broadcast. It really takes only minutes to make subs from the CC and sync them with an existing video file (though getting the RAWs takes time, of course).
Practical hints
First, you need a good API spy program. I use WinAPIOverride32 and find it very good. It can automatically hook itself into the TV tuner player process, log API calls and export the results to an XML file. It also logs function call times (down to the millisecond), which gives a basis for sync times. Just subtract the time of the first displayed CC from all the call times and you get relative CC time stamps. Be sure to make an appropriate monitoring file that includes only the DrawTextW function (USER32.DLL) or you’ll drown in unrelated API calls.
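The time stamp subtraction can be sketched in Python as follows. This is only an illustration: the call times are made-up values and `relative_times` is a hypothetical helper. Reading the times out of the WinAPIOverride32 XML export is left out because the layout of that file is not described here.

```python
from datetime import datetime, timedelta

def relative_times(call_times):
    """Return each call time as an offset from the first call."""
    if not call_times:
        return []
    first = call_times[0]
    return [t - first for t in call_times]

# Hypothetical wall-clock times of three logged DrawTextW calls.
calls = [
    datetime(2008, 2, 12, 21, 0, 3, 250000),
    datetime(2008, 2, 12, 21, 0, 7, 900000),
    datetime(2008, 2, 12, 21, 0, 12, 100000),
]

offsets = relative_times(calls)
for offset in offsets:
    # Each offset is a timedelta; the first one is always zero.
    print(offset.total_seconds())
```

The offsets are relative to the first caption, so a constant shift is usually still needed to line them up with a particular video file.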
Next, you need a 1seg TV tuner which comes with a player program suitable for this method. The player program must use DrawTextW to display CC and must not be protected against API spies. Check the “Confirmed tuners and software” section for examples.
After you get the log of DrawTextW calls with their text data, you need to convert it to a subtitle format. Sometimes you’ll have to filter excess DrawTextW calls out of the log (some players use the function for more than just CC). Also, a single CC line might be displayed by several consecutive DrawTextW calls, and you’ll have to find a way to combine them; their display times are usually only tens of milliseconds apart. Duration information is not present either, so you’ll have to estimate it yourself. For example, make each sub last 5 seconds or until the next line is displayed (exactly how the player in my configuration behaves). It is a good idea to automate the process with scripts, custom software etc.
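A minimal Python sketch of the combining and duration steps, under stated assumptions: the log has already been reduced to `(time_ms, text)` pairs, calls no more than 100 ms apart belong to the same caption, and their texts are simply concatenated. The threshold, the joining rule and the function names are all hypothetical choices, not how any particular tool does it.

```python
def merge_calls(calls, gap_ms=100):
    """Join the texts of calls that arrive within gap_ms of the previous call."""
    merged = []
    last_time = None
    for time_ms, text in calls:
        if merged and time_ms - last_time <= gap_ms:
            start, joined = merged[-1]
            merged[-1] = (start, joined + text)  # keep the first call's time
        else:
            merged.append((time_ms, text))
        last_time = time_ms
    return merged

def add_durations(lines, max_ms=5000):
    """End each line after max_ms or when the next line starts, whichever is first."""
    cues = []
    for i, (start, text) in enumerate(lines):
        end = start + max_ms
        if i + 1 < len(lines):
            end = min(end, lines[i + 1][0])
        cues.append((start, end, text))
    return cues

# Made-up log: the first two calls are 40 ms apart and form one caption.
calls = [(0, "おはよう"), (40, "ございます"), (3000, "元気?"), (12000, "はい")]
cues = add_durations(merge_calls(calls))
for cue in cues:
    print(cue)
```

Filtering out non-caption calls would go before `merge_calls`; what to filter on depends on the player, so it is not shown.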
XMLSubParser
(http://www.sendspace.com/file/yyulhw)
XMLSubParser is a custom program which parses the output of WinAPIOverride32, filters out the CC text and writes subtitles in SRT format, encoded as UTF-16 little-endian. It works fine with output generated by WinAPIOverride32 spying on PCastTV for ワンセグ, Ver: 2.14, and may also work with some other player software.
XMLSubParser splits srt files at points where commercials are present. This makes it easier to sync the subs to a video file.
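For anyone writing their own converter instead, here is a rough Python sketch of producing SRT text and encoding it as UTF-16 little-endian with a byte order mark. It is not XMLSubParser's code; the `(start_ms, end_ms, text)` cue tuples and the function names are assumptions made for the example, and splitting at commercials is not attempted.

```python
def srt_time(ms):
    """Format milliseconds as an SRT time stamp, HH:MM:SS,mmm."""
    hours, rest = divmod(ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    seconds, millis = divmod(rest, 1000)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d},{millis:03d}"

def to_srt(cues):
    """Build SRT text: numbered blocks separated by a blank line."""
    blocks = []
    for index, (start, end, text) in enumerate(cues, start=1):
        blocks.append(f"{index}\n{srt_time(start)} --> {srt_time(end)}\n{text}\n")
    return "\n".join(blocks)

def encode_srt(srt_text):
    """Encode as UTF-16 little-endian, prefixed with the matching BOM."""
    return b"\xff\xfe" + srt_text.encode("utf-16-le")

# Made-up cues for illustration.
cues = [(0, 3000, "おはようございます"), (3000, 8000, "元気?")]
data = encode_srt(to_srt(cues))
print(len(data), "bytes")
```

Writing `data` to a file opened in binary mode (`"wb"`) gives a file that most players and editors should recognise as UTF-16.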
Usage:
XMLSubParser.exe filename mult
filename is the name of the XML file. If it includes spaces, put it in double quotes.
mult is a time speed factor. If you were recording subs at double speed, make it 2. At normal speed (without fast forward) it should be 1.
The program is not very well written. It is not foolproof and doesn’t include any help. It surely has bugs, which I am trying to catch. So if you have a problem parsing anything, post the XML file in question in this thread and I’ll try to make it work ; ) Overall, use the program at your own risk.
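To illustrate what a speed factor like mult does, here is a small Python sketch. It assumes the factor simply multiplies the logged times, so that captions captured during double-speed playback are stretched back to normal video time; whether XMLSubParser works exactly this way is not stated here, and `scale_cues` is a hypothetical helper.

```python
def scale_cues(cues, mult):
    """Multiply cue start and end times (in ms) by the playback speed factor."""
    return [(round(start * mult), round(end * mult), text)
            for start, end, text in cues]

# Logged at double speed: 1.0 s of wall-clock time covers 2.0 s of video.
logged = [(1000, 2500, "line one"), (2500, 4000, "line two")]
print(scale_cues(logged, 2))
```

With a factor of 1 the cues come back unchanged.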
Confirmed tuners and software
Here is the list of configurations for which my method (monitoring DrawTextW function calls) works. If you have a 1seg TV tuner not listed here and have tried the method, please post a comment in this thread describing your experience. I’ll add your configuration to the list.
1. My configuration
1seg TV tuner model: Buffalo DH-KONE/U2V
Player Software: PCastTV for ワンセグ, Ver: 2.14
Status: works
2. Submitted by siantut
1seg TV tuner model: Buffalo DH-KONE/U2
Player Software: PCastTV for ワンセグ, Ver: 2.04 upgraded to 2.14 from Buffalo's OHP
Status: works
Last edited by rokonoko on Sun Feb 17, 2008 12:43 am; edited 2 times in total
 |
jholic Joined: 03 Feb 2004 Total posts: 6110 Location: missin' hawaii Gender: Male |
Posted: Wed Feb 13, 2008 12:43 am Post subject: Post Rating: 0 |
this is an excellent tutorial. i would label it, but i don't have enough room.
i only wish you had arrived sooner. i remember about two years ago, we had two hearing-impaired members asking about cc and subs. at the time, we didn't have either. hopefully, they are still around to enjoy the fruits of your labor.
 |
siantut |
Posted: Sat Feb 16, 2008 10:15 am Post subject: Post Rating: 0 |
Rokonoko, thanks for the great tutorial.
I had never known that CC could be extracted.
I happened to buy a 1seg TV tuner in September last year when I broke my PC; mine is older than your version, I think.
It is:
1seg TV tuner model: Buffalo DH-KONE/U2 (without the V like yours)
Player Software: PCastTV for ワンセグ, Ver: 2.04 upgraded to 2.14 from Buffalo's OHP.
(they now have version 2.15 Beta)
I tested it with WinAPIOverride32 and yes it works alright. Maybe you can add it to the list.
And I wonder if it is also useful to list things that don't work.
I have tested it with my TV Tuner card IO Data GV-MVP/GX2-GX2W MagicTV and it doesn't work.
Have you ever tested it with a 1seg mobile phone? I have a 1seg phone which also shows the CC subtitles, and the TV shows can be recorded on the phone, then transferred to a PC by microSD and opened using compatible s/w, but I have never tried it. I think it is copy-protected too.
Could you please post the program you wrote to convert the logs to srt? I can help by posting some stuff in srt format, and it will speed up my subbing process too. Thank you in advance. _________________ Bye bye and thanks for everything.
 |
rokonoko |
Posted: Sun Feb 17, 2008 12:42 am Post subject: Post Rating: 0 |
@ siantut
Great!
I've added your configuration to the list.
As for IO Data GV-MVP/GX2-GX2W MagicTV, I've checked the manufacturer's site and it seems that it's not a real TV tuner. It requires an external receiver and actually re-encodes the signal. Am I right?
I don't have a 1seg compatible mobile so I cannot test it. But it seems to be an interesting idea. Try it if you can.
Here is the program you've asked for: http://www.sendspace.com/file/yyulhw. I've added the description to the tutorial.
 |
siantut |
Posted: Sun Feb 17, 2008 10:26 am Post subject: Post Rating: 0 |
@Rokonoko
Your XML parser works GREAT!! Thank you for writing it and sharing it!
About IO Data GV-MVP/GX2-GX2W MagicTV, actually it is a real TV Tuner (double tuner cards), we don't have to connect it to anything (except to TV antenna) to watch analog TV, but yes in order to catch digital/HDTV, it needs to be connected to an external digital tuner. In my case, I connect it to the digital tuner in my DVD recorder, that's why I can catch HDTV shows with CC. I was just hoping that I could catch the CC shown on the TV using WinAPIOverride32 but it doesn't work. I also thought perhaps the shows in analog TV contain hidden CC, but as MagicTV refuses to be spied on, it doesn't show anything at all.
And yes I will try to see if I can catch the CC from my keitai too. I will post it here whether it is possible or not. Maybe not.
Please let me know what things I can contribute now that I can do this, I will gladly help. Meanwhile, I will still try/experiment a few things. If I can find something good, I will post it here.
Thanks again.
 |
Sapporo Girl Joined: 18 Apr 2007 Total posts: 211 Location: Sapporo Gender: Female |
Posted: Tue Feb 19, 2008 1:08 am Post subject: Post Rating: 0 |
You guys rock!!! Because I'm not good at such technical things, I was on the verge of offering to buy the equipment for someone else if I knew they could get the signal and they were willing to share with our members. Siantut, when you're up and running to your satisfaction, please let us know. If I have the Japanese script and Tianj's timing, I can complete my subs much faster and our members will be happier. I'm looking forward to having this extra resource for the Spring 2008 season.
Thank you so much!!!
 |
rokonoko |
Posted: Tue Feb 19, 2008 5:02 am Post subject: Post Rating: 0 |
@Siantut
I'm glad that everything worked out fine : )
As for contribution, it would be great if you could provide subs for some dramas which I cannot record due to broadcast time conflicts. Some people already PMed me about this. Maybe, they've PMed you too ; ) Such dramas are listed in the http://www.d-addicts.com/forum/viewtopic_58128.htm post.
@Sapporo Girl
If you want to get Daisuki!! drama Japanese subs, Siantut is surely the one you want to ask for help.
 |
canmield Joined: 23 Jan 2008 Total posts: 39 Location: Chicago Age: 7 Gender: Male |
Posted: Tue Feb 19, 2008 5:45 am Post subject: Post Rating: 0 |
I'm really glad that you understand how to do this and are willing to take the time and effort.
I second what Sapporo girl said. It's so much faster with a CC script, and just in time.. there's a character with a horrible accent on the show I'm working on now.
Really I can't thank you enough because, to be honest... I'll never be able to do that myself.
 |
siantut |
Posted: Thu Feb 21, 2008 2:24 pm Post subject: Post Rating: 0 |
@Rokonoko, sorry for the late reply, I was busy running around Akihabara checking all 1seg gadgets :p
Yes, I will gladly do that (I mean helping with Japanese subs when you can't do it because of a time clash). BTW, I was wondering, did you receive my PM about 10 days ago? May I PM you if necessary? Thanks.
As for "Daisuki", which Sapporo Girl needs, unfortunately it clashes with Shika Otoko's time slot, and I am subbing Shika Otoko so I need to record that myself too (to experiment with timings etc). To solve this problem, I bought another 1seg tuner today, exactly like the one you have, and managed to record both "Daisuki" and "Shika Otoko" at the same time. I am not sure how the result will turn out since I have not checked it very closely yet, but I should be able to post the "Daisuki" Japanese subs tomorrow, I think. I am dead beat today from too much walking.....
@Sapporo Girl: Do you need "Daisuki"? I will post "Daisuki" for you tomorrow at about the same time as now. But gomen if the timing is off.
@Canmield: Yes it is all possible thanks to Rokonoko!
 |
Sapporo Girl Joined: 18 Apr 2007 Total posts: 211 Location: Sapporo Gender: Female |
Posted: Fri Feb 22, 2008 3:16 am Post subject: Post Rating: 0 |
siantut wrote:
@Sapporo Girl: Do you need "Daisuki"? I will post "Daisuki" for you tomorrow at about the same time as now. But gomen if the timing is off.
Siantut! I love you!!! I truly do!!! Thank you so much! It would make all the difference in the world to have the "Daisuki" script!!! I definitely need it. Don't worry about the timing. I'm hoping Tianj will still send the timing files. If she can't, I can make timing adjustments.
If there is anything you need from up here in Snow Country, please let me know. I'm serious because you went through the trouble of getting another gadget in addition to offering to do this in the first place. If you need snow, we've got plenty. To be honest, I'm not sure what else Hokkaido offers the rest of Japan, but I'm willing to look for it.
Thank you so much!!!
 |
siantut |
Posted: Fri Feb 22, 2008 2:00 pm Post subject: Post Rating: 0 |
Hihi, Sapporo Girl, really, I was just joking!! How can I ask anything? But I am really touched by your sincerity, thank you. It is the heart/thought that counts, so please don't worry about it.
And could you please attach or send me Tianj's timings for episode 6? I have all the text, so I will put the Japanese sentences in their right places in the timings. My first try flopped: although I managed to catch all the text, the timings are weird. They jumped back about 4 minutes after each CM break, so it is all messed up. If I have the timings, I can put the text at the right times and in the right order. Thanks in advance.
From episode 7, I will do it not by recording but by running the s/w in real time, just like Rokonoko does. I hope the CMs won't mess up the timings if it is done in real time. Don't worry, I can either drop extracting Shika Otoko, or I can install one tuner on my other PC. No problem. I just feel so bad about disappointing you with ep 6.
Ganbarimashou! I will upload ep 6 with the Japanese subs after I get the timings from you. Thanks!
EDIT: I already asked Tianj directly for the timings. Thanks, anyway.
Last edited by siantut on Fri Feb 22, 2008 2:58 pm; edited 3 times in total