Re: Fast 2D Blit and Fast Texture Upload
- From: "jbwest" <jbwest@xxxxxxxxxxx>
- Date: Tue, 17 Jul 2007 19:49:51 -0700
"sorcerer" <nagual.hsu@xxxxxxxxx> wrote in message
news:1184723575.003740.127330@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
I use Linux/Xorg. I have 16 frame grabbers, and each has to
output 30 frames per second to system memory at a resolution of
720x480. I want to display these 16 live videos on the screen at
arbitrary resolutions, all at the same time. I've used glTexSubImage2D
and GL_QUADS to upload/blit every frame, and I found that the CPU usage
of my blitting program is quite high. After some googling, the
likely main reasons for the high CPU usage are the
following:
1. Textures are not transferred directly from the application
buffer (user-space memory) to the GPU's texture memory. The
graphics chip's kernel driver copies the texture from
user space to a DMA buffer (kernel space) and then triggers
the DMA transaction to upload the texture.
2. The native internal texture format of the graphics chip is
not the same as that of the video data grabbed from the
frame grabber, and thus pixel swizzling occurs (at the
driver level?).
According to Nvidia's documents, to avoid pixel swizzling
and to eliminate the extra texture copy, one must use BGRA format in
system memory and use pixel buffer objects. But BGRA format means 4
bytes per pixel, and with 16 frame grabbers this would kill
the bandwidth even on the PCIe bus. I don't know if there is a
better way around this. The video data from the frame grabber is
currently in RGB565 format. Any suggestions would be appreciated.
By the way, I tested my program on Intel's onboard graphics
chip 965G, and the situation got worse. Please give some hints.
Thank you very much.
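The bandwidth worry in the quoted message can be put into numbers. The following is a minimal sketch in C that only multiplies out the figures given above (16 streams, 720x480, 30 fps, 2 bytes per pixel for RGB565 versus 4 for BGRA). The function name upload_bytes_per_second is made up for illustration and is not part of any API.

```c
#include <stdint.h>

/* Bytes per second needed to upload `streams` video streams of
 * width x height pixels at `fps` frames per second.
 * Hypothetical helper, for back-of-the-envelope arithmetic only. */
uint64_t upload_bytes_per_second(uint64_t streams, uint64_t width,
                                 uint64_t height, uint64_t fps,
                                 uint64_t bytes_per_pixel)
{
    return streams * width * height * fps * bytes_per_pixel;
}
```

upload_bytes_per_second(16, 720, 480, 30, 2) gives 331,776,000 bytes/s (about 332 MB/s) for RGB565, and the same call with 4 bytes per pixel gives 663,552,000 bytes/s (about 664 MB/s) for BGRA. For comparison, a first-generation PCIe x16 link is nominally 4 GB/s in each direction, assuming the card sits in such a slot, so the extra driver copy and format conversion may matter more than the raw bus figure. Sustained throughput in practice is lower than the nominal number.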
1) A PBO should eliminate the 2nd copy, might even make the upload async,
and could make a big difference here.
2) Load the texture data as 16-bit unsigned intensity (intensity16, no
conversion), have a 65k-entry RGBA 2nd texture as the "color table", and do a
dependent texture read in a shader to convert the 16-bit value to rgb output.
Start without the shader to see if the 16-bit load is any faster. Might not work
due to precision problems or texture length problems, I dunno.
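To make the "color table" idea concrete, here is a CPU-side sketch in C of what the table would contain and what the dependent read does per pixel. It is not the shader itself, and the function names are hypothetical. It assumes the usual RGB565 layout (R in bits 15..11, G in bits 10..5, B in bits 4..0), which should be checked against what the frame grabber actually delivers.

```c
#include <stddef.h>
#include <stdint.h>

/* Widen 5- and 6-bit channels to 8 bits by replicating the high bits. */
static uint8_t expand5(unsigned v) { return (uint8_t)((v << 3) | (v >> 2)); }
static uint8_t expand6(unsigned v) { return (uint8_t)((v << 2) | (v >> 4)); }

/* Fill a 65536-entry RGBA table (4 bytes per entry) indexed by the raw
 * RGB565 value. Assumed layout: R = bits 15..11, G = 10..5, B = 4..0. */
void build_rgb565_table(uint8_t *table)
{
    for (unsigned p = 0; p < 65536u; ++p) {
        table[p * 4 + 0] = expand5((p >> 11) & 0x1Fu);
        table[p * 4 + 1] = expand6((p >> 5) & 0x3Fu);
        table[p * 4 + 2] = expand5(p & 0x1Fu);
        table[p * 4 + 3] = 255; /* opaque alpha */
    }
}

/* CPU model of the dependent read: each 16-bit pixel selects one entry. */
void convert_frame(const uint16_t *src, uint8_t *dst_rgba, size_t count,
                   const uint8_t *table)
{
    for (size_t i = 0; i < count; ++i) {
        const uint8_t *entry = table + (size_t)src[i] * 4;
        dst_rgba[i * 4 + 0] = entry[0];
        dst_rgba[i * 4 + 1] = entry[1];
        dst_rgba[i * 4 + 2] = entry[2];
        dst_rgba[i * 4 + 3] = entry[3];
    }
}
```

The table is 65536 x 4 bytes = 256 KB and only needs to be built and uploaded once. On the GPU the lookup would have to be exact (nearest filtering, no interpolation between entries), which is where the precision concern comes in. If a 65536-texel 1D texture exceeds GL_MAX_TEXTURE_SIZE, laying the table out as 256x256 is one possible workaround, untested here.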
A more complicated shader could do the swizzle for ya on the GPU anyway,
maybe directly w/o a dependent read.
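As a rough idea of the "directly w/o a dependent read" variant, the sketch below models in C the float arithmetic a fragment shader might do if the 16-bit texel arrives as a normalized value in [0, 1]. It is an assumption about how such a shader could be written, not tested GPU code. The function name is hypothetical, and the RGB565 layout (R high, B low) is assumed.

```c
/* CPU model of per-fragment math: t is the 16-bit texel as sampled,
 * normalized to [0, 1]. Writes R, G, B in [0, 1] to rgb[].
 * Truncation via (int) stands in for floor() on non-negative values. */
void unpack565_from_normalized(float t, float rgb[3])
{
    /* Recover the integer texel value 0..65535, rounding to nearest. */
    float v = (float)(int)(t * 65535.0f + 0.5f);
    float r = (float)(int)(v / 2048.0f);    /* top 5 bits */
    float rest = v - r * 2048.0f;           /* low 11 bits */
    float g = (float)(int)(rest / 32.0f);   /* middle 6 bits */
    float b = rest - g * 32.0f;             /* low 5 bits */
    rgb[0] = r / 31.0f;
    rgb[1] = g / 63.0f;
    rgb[2] = b / 31.0f;
}
```

This only works if the sampled value carries all 16 bits exactly, so texture filtering or a lower-precision internal format would corrupt the unpacked channels. That is the same precision caveat as with the table lookup.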
There's probably lots of work on this already -- maybe even the OpenGL video
codecs for mplayerhq (http://mplayerhq.hu) or similar.
-jbw