8/11/10

3d Modelling : the challenges for Compression

Quick reminder: qoob is a modeller and library that stores 3d models as a series of modelling commands rather than as a mesh. The hope is to get smaller models for intros. As I write qoob, three fundamental challenges are emerging:
  1. The size of the qoob library with the modelling commands code
  2. Storing modelling commands in a compact way
  3. Picking and selection compression
The Size of the Library
The Qoob library is around 2k after compression. It must be included in intros to decode cube models, so the existing library simply won't work at 4k. It's fine for 64k and for 8k/16k. However, I'm hopeful a much smaller one could be produced. There would be three keys:
  • use DX maths and subdivision
  • reduce the modelling options and target abstract objects ONLY
  • use default values for all functions so storing a model means storing only a series of commands - no parameters.
That may seem crazy but there is a huge range of abstract objects that can be done this way, from plants to cubes to geometric abstracts. Even basic human figures. The first characters I showed on this blog were done this way!

Storing Modelling Commands

I made the mistake of targeting a low number of entry points in qoob lib. This kept the library small and tight but meant even a simple extrude needed 9 bytes to store!! However, after considerable redesign, I've managed to reduce the average byte count per modelling command to less than 2. The result is less compressible, but not by much, and a factor of 5x on average is a very good saving. 50 modelling commands is enough for many models. That's 100 bytes BEFORE compression.

The first trick was to increase the number of possible commands (e.g. translate became translate, translateX, translateY, translateZ). Translate in X requires only one float whereas translate requires 3. Often the modeller does not need the full expressive power. In most cases the bytes saved on the new commands' parameters far outweigh the extra code, which is highly compressible.


The second trick comes from accepting that we don't need the full range and accuracy of floats. A single byte is enough for most functions and, should greater range be required, the modeller can always concatenate. For example, a translate in X can be expressed as a single byte. If further translation is required, the modeller simply translates again; usually this is not needed.
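To make the two tricks concrete, here is a minimal sketch of a decoder for such a command stream. The opcode names and values and the 1/16 quantisation step are assumptions made for illustration; they are not qoob's actual encoding.

```c
#include <stddef.h>

/* Hypothetical opcodes: illustrative only, not qoob's real command set. */
enum { OP_TRANSLATE_X, OP_TRANSLATE_Y, OP_TRANSLATE_Z, OP_TRANSLATE };

/* Assumed quantisation: one signed byte in steps of 1/16 of a unit. */
#define PARAM_SCALE (1.0f / 16.0f)

float decodeParam(unsigned char b) {
    return (float)(signed char)b * PARAM_SCALE;
}

/* Runs a command stream, accumulating the total translation in pos[3].
   Returns the number of bytes consumed. */
size_t runCommands(const unsigned char *cmd, size_t len, float pos[3]) {
    size_t i = 0;
    while (i < len) {
        unsigned char op = cmd[i++];
        if (op == OP_TRANSLATE) {            /* full form: 3 parameter bytes */
            if (i + 3 > len) break;
            for (int a = 0; a < 3; a++) pos[a] += decodeParam(cmd[i++]);
        } else if (op <= OP_TRANSLATE_Z) {   /* single-axis form: 1 parameter byte */
            if (i + 1 > len) break;
            pos[op] += decodeParam(cmd[i++]);
        } else break;                        /* unknown opcode: stop */
    }
    return i;
}
```

A single-axis translate costs 2 bytes here, and a move beyond the range of one byte is simply two translateX commands in a row.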


Selection Compression

A talk with iq depressed me. Essentially, a selection needs 5 bytes (command and quad id) to store. In addition, polygon ids are effectively random (over many picks) and so do not compress. Iq guessed that 50 picks would render qoob models less effective than mesh based compression.
Well, this is a fundamental mathematical barrier. The trick is to avoid it. Procedural selection comes to the rescue. Commands like "select similar" from wings3d mean that the modeller picks a single seed polygon then expands the selection using a command or two.
I'm still exploring the solution here, so more when I'm sure what works, but I'm pleased with what I have so far.
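As one illustration of procedural selection, here is a sketch of a "grow" command that expands a seed selection by one ring of neighbouring quads. The Quad layout and the function names are hypothetical, and this is not wings3d's "select similar"; it only shows how one picked id plus a few one-byte commands can stand in for many stored ids.

```c
/* Hypothetical mesh layout: each quad is four vertex indices. */
typedef struct { unsigned v[4]; } Quad;

/* Returns 1 if quads a and b share at least one vertex. */
int sharesVertex(const Quad *a, const Quad *b) {
    for (int i = 0; i < 4; i++)
        for (int j = 0; j < 4; j++)
            if (a->v[i] == b->v[j]) return 1;
    return 0;
}

/* Grows the selection by one ring. sel[] holds 0 or 1 per quad.
   Newly found quads are marked 2 during the pass so that they do not
   spread the selection further until the next call.
   Returns the number of quads added. */
unsigned growSelection(const Quad *q, unsigned nq, unsigned char *sel) {
    unsigned added = 0;
    for (unsigned i = 0; i < nq; i++) {
        if (sel[i]) continue;
        for (unsigned j = 0; j < nq; j++)
            if (sel[j] == 1 && sharesVertex(&q[i], &q[j])) {
                sel[i] = 2;
                added++;
                break;
            }
    }
    for (unsigned i = 0; i < nq; i++)
        if (sel[i] == 2) sel[i] = 1;
    return added;
}
```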

So qoob lib got bigger but that could be fixed by porting to dx and reducing flexibility. Models got much smaller and picking is nearly solved but again costs bytes in the library.

8/9/10

Simple Algebraic Surfaces Reference

Serious size coding involves trawling forums and the net for ideas and resources. Here is a little link that came over twitter this morning with a bunch of tiny algebraic surfaces that may, or may not, be useful:

http://www.freigeist.cc/gallery.html

7/19/10

Qoob Matures



Qoob now has about 80% of the functionality I want in a basic modeller, at least for now. For personal reasons I will take a break from coding for a few weeks and leave you with a few more sample images. The next step will be to make an actual intro using the tool.



Click on the images for larger versions. I've managed to get some organic, some holes and some asymmetry into the modeller. Undo is implemented (well,ok, one level). The next major step is to put the geometry lib into a 64k framework.





But a burning question is - how big is the Qoob library? What's the overhead?
Here is a test I performed.
Using crinkler 1.0 (I can't get 1.1 to work for me) I constructed a minimal OGL program: 600 bytes after compression. I then added the Qoob library and called each function once to force inclusion. The program size was 2.4k with crinkler settings of /CRINKLER /COMPMODE:SLOW /ORDERTRIES:3000 /UNSAFEIMPORT. No smarts here to help crinkler and probably not the right settings. Likely, the library can shrink a bit but it could also grow a little with the few functions I have left to put in. With a difference of about 1.8k, it's safe to assume therefore that <2k is the Qoob overhead.

2k - that's good, better than I expected. Very useful for 64k. The models range from 3 bytes (a chamfered cube) to 200 for the characters.

Maybe it's worth rewriting this using DirectX and getting it into 4ks after all? The 4k might be non-demoscene standard - a sort of model show in 4k... but that's an achievement by itself in some ways.

Shorter code for Point in Interval

Nice little coding trick for checking if a value lies between two other values. The result is faster and (slightly) smaller than checking both bounds. The trick could be extended to do bounding box intersections for collision detection, for example.

Neat trick


It resolves down to this macro:

#define InRange(ch, a, b) ((unsigned)((ch) - (a)) <= (unsigned)((b) - (a)))
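A small sketch of the macro above in use, next to the conventional two-comparison form and the bounding box extension mentioned earlier. The names IN_RANGE, inRangeSlow and pointInBox are mine; the trick assumes a <= b and that the subtractions do not overflow.

```c
/* Single unsigned comparison: if x < a, then x - a wraps around to a huge
   unsigned value and the test fails, so one compare covers both bounds. */
#define IN_RANGE(x, a, b) ((unsigned)((x) - (a)) <= (unsigned)((b) - (a)))

/* The conventional form, for reference. */
int inRangeSlow(int x, int a, int b) { return x >= a && x <= b; }

/* Point inside an axis-aligned 2d box: one range test per axis. */
int pointInBox(int x, int y, int x0, int y0, int x1, int y1) {
    return IN_RANGE(x, x0, x1) & IN_RANGE(y, y0, y1);
}
```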

.cool

7/18/10

Qoob Character modelling


So, Qoob, the tiny modeller, finally is competent enough to model simple characters. It's all a bit pikachu for now - but that may be more my inner pokemon than the tool, not sure yet. The qoob library grew hardly at all to support this, only the modeller.

Characters are about 200 bytes. That's ok but not great. The culprit is picking. You do quite a lot of polygon picking to model the limbs, eyes, nose etc. Click on the images to see larger versions.



Picking consists of a command (1 byte) and a polygon id (3 bytes). That's 4 bytes before compression. Assuming the command compresses well, the specific polygon id is the problem. A lot of the time the first byte will be zero, which will help. Still, we can assume a total of 3 bytes on average per pick after compression. These models take about 50 picks. Hence 150 bytes as a base. Ideas:
1. Store the command pick n times (meaning select the next polygon n away in the array). This means great compression but even so, that's a lot of data to compress and there may be problems.
2. Duplicate what the modeller does, by storing the 2d pick co-ordinates and having qoob lib project them into polygon space. Well, it's 2-3 bytes anyway and then the projection code. So no.
3. Delta encode the polygon id. In my modelling at least this would help often. Polygons being picked often come from the same base polygon in the model before subdivision and therefore their ids are similar.
4. Exploit the hierarchy of subdivision more ... have to think here. In theory a polygon might be identified by 3 bits for the original face of the cube (the base input) plus 2 bits x n, where n is the subdivision level.
5. Restrict picking to be pick + grow pick. This would help with large groups of polys but the modeller would go nuts.
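Idea 4 can be sketched as follows. It assumes a cube base with 6 faces (3 bits) and that each subdivision level turns quad q into four children stored at indices 4*q + k, so every level appends 2 bits. That numbering is an assumption on my part, and the function names are hypothetical.

```c
/* Packs a base face (0..5) and one child index (0..3) per subdivision
   level into a single id: 3 bits for the face plus 2 bits per level. */
unsigned packQuadId(unsigned face, const unsigned char *child, unsigned levels) {
    unsigned id = face;
    for (unsigned l = 0; l < levels; l++) id = id * 4 + (child[l] & 3);
    return id;
}

/* Number of bits such an id needs. */
unsigned quadIdBits(unsigned levels) { return 3 + 2 * levels; }

/* Inverse of packQuadId: fills child[] and returns the base face. */
unsigned unpackQuadId(unsigned id, unsigned char *child, unsigned levels) {
    for (unsigned l = levels; l-- > 0; ) {
        child[l] = (unsigned char)(id & 3);
        id >>= 2;
    }
    return id;
}
```

At three subdivision levels that is 9 bits per pick, against the 24 bits of a 3 byte id.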

... hmm need to think more, none of the above convince me.

7/10/10

3d Modelling thoughts.


The objective of Qoob is to be a demoscene modeller that targets 64k intros. It should allow hundreds of models in a 64k. That means man made and organic. Organic seems to require arbitrary extrusion (think pipe or ribbon). It allows for models like loonies use in their 4ks eg benetoite and, more impressively, 64ks such as Flight666. The maths for modelling with arbitrary extrusions is quite large (>500 bytes) and adds a lot of bytes to Qoob which is already growing in size.

In addition, Qoob must support selecting individual polygons for arbitrary complex shapes. That will be the key to true modelling. But selecting, say 50 polygons currently requires:
select p1
select p2
select p3
... 50 selections
to be stored and compressed. Compressing the command is obvious and cool. I'll just store the command queue and the data queue separately. The data for each entry is four bytes and may not compress well. I could try sorting and delta encoding - the usual tricks - of course. This worries me.

Size wise then, I'm struggling now: selections and extrusions add lots of bytes. One to the lib and one to the models. Suggestions are welcome.
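For reference, the "sort and delta encode" trick on the data queue might look like the sketch below. It assumes the order of the picks does not matter (sorting destroys it), and the function names are hypothetical.

```c
#include <stdlib.h>

/* qsort comparison for unsigned ids. */
static int cmpUint(const void *a, const void *b) {
    unsigned x = *(const unsigned *)a, y = *(const unsigned *)b;
    return (x > y) - (x < y);
}

/* Sorts the ids, then replaces each id with its difference from the
   previous one. The first entry stays absolute. Small deltas leave many
   zero bytes, which is what the compressor likes. */
void deltaEncode(unsigned *ids, size_t n) {
    qsort(ids, n, sizeof *ids, cmpUint);
    for (size_t i = n; i-- > 1; ) ids[i] -= ids[i - 1];
}

/* Rebuilds the sorted ids with a running sum. */
void deltaDecode(unsigned *ids, size_t n) {
    for (size_t i = 1; i < n; i++) ids[i] += ids[i - 1];
}
```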

However, on the bright side, Qoob could support displacements on large and small scale, like sculpting. I might therefore be able to create highly detailed geometry with bumps and ridges, for very little cost.

In the mean time, some screenies of what can be done without such functions:







p.s. Hope people like the new design for the pages here. Also check out the demotivation demoscene picture to the right (with apologies to iq/mentor) as elevated races toward 500 thumbs on Pouet.

7/6/10

Catmull Clark Subdivision Musings


Before you read on: Blogger is simply crap for posting code. I've done my best with the formatting but if you see dodgy symbols or things that make no sense, blame Blogger, as this code is right out of the compiler.

Modellers seem to like quads. Catmull Clark subdivision smooths and maintains a quads only model. Doo-Sabin introduces triangles. Loop works on triangles. So I decided to use Catmull Clark in qoob. I got it working last night after spending a whole weekend when I should have been in the sun. The model above, I estimate, would be around 22 bytes to store compressed. It's approaching organic but is still embarrassingly geometrical. That has to be the next goal, to produce some organic stuff.

Most code on the net works using winged edge data structures. Classes abound. The examples are no use for size coders. Fortunately iq tackled CCSD in his 2007 presentation for breakpoint, Tricks and techniques for RGBAs Past and Future Intros.

(By the way his normal a mesh trick in that presentation is without doubt the best piece of size coding I've seen.)

Firstly, wikipedia has a description of the algorithm. Unfortunately, it's not totally clear what an edge point is - the midpoint of an edge or the new displaced midpoint of the edge (it's the first). But there are other references to clear this up. IQ proved it's possible to do Catmull Clark without winged edges but he still uses edge ids. I wanted to avoid this and did so, but as a result, each time I add a vertex I loop through all vertices to see if an identical one already exists. I think it's a case of what I gain on the flat I lose on the hill. Here is the final code in old style C (i.e. no C++ operators for maths) which makes it look horrendous - that's why operators were introduced :-).

addVert is key. It checks all existing vertices for an exact duplicate (exact is fine in this case) and returns either the index of the existing one or adds the new vertex and returns its index. addQuad adds a quad to the model. The last line deletes the old quads. The routine supports subdivision without displacement as well as CCSD, hence the if(smooth) in places.

Vert disp[MAXSIZE]; // displacements
uint val[MAXSIZE]; // valences

void subdObj (Object *ob, uint smooth) {
    Vert tmp,ep,fp;
    uint eid[4],cni; // indices for new vertices: edge points and centre
    uint onq = ob->nq;
    uint onv = ob->nv;

    ZeroMemory(disp,MAXSIZE*sizeof(Vert));
    ZeroMemory(val,MAXSIZE*sizeof(uint));
    for (uint iq=0; iq < onq; iq++ ) {
        centroid (ob, iq, &fp);
        cni=addVert(ob,&fp);
        for (uint v=0; v<4; v++ ) { // for each vertex in each quad
            lerp(&(QV(ob,iq,v)), &(QV(ob,iq,(v+1)&3)), 0.5f, &ep);
            copy(&ep,&tmp);
            if (smooth) smul(&tmp,0.5f);
            eid[v]=addVert(ob,&tmp); // original vertices of edge/4.0
            // add contrib of face points to displacement for new edge points
            smadd (&fp,0.25f, &(disp[eid[v]]), &(disp[eid[v]]) ); // add fp/4 to edge vector
            // add face points to displacement for original point
            add (&fp, &(disp[ob->q[iq][v]]),&(disp[ob->q[iq][v]]) );
            // add 2x edge points to disp for original points
            smadd (&ep, 2.0f, &(disp[ob->q[iq][v]]),&(disp[ob->q[iq][v]]) );
            val[ob->q[iq][v]]++; // add up valences
        }
        // now construct four new quads
        for (uint k=0;k<4;k++ ) addQuad (ob, ob->q[iq][k], eid[k], cni, eid[(k-1)&3], selected(iq));
    }
    // now displace all vertices...face points will be unaffected
    if (smooth) for (uint v=0; v < ob->nv; v++ ) {
        if (v < onv) { // this must be an original point
            smul (&(disp[v]), 1.0f/val[v]); // average the displacement values
            smadd (&(ob->v[v]), val[v]-3.0f, &(disp[v]), &(ob->v[v]) ); // correct for extraordinary vertex
            smul (&(ob->v[v]), 1.0f/val[v]); // divide through by valence
        } else add (&(ob->v[v]), &(disp[v]), &(ob->v[v]));
    }
    for (uint iq=0; iq<onq; iq++ ) delQuad(ob,0);
}
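For readers who want to see the shape of addVert, here is a minimal sketch matching the description above: a linear search for an exact duplicate, else append. The Vert and Object layouts and MAXV are assumptions for illustration, not the actual qoob types.

```c
#define MAXV 1024   /* assumed capacity */

typedef struct { float x, y, z; } Vert;                /* hypothetical layout */
typedef struct { Vert v[MAXV]; unsigned nv; } Object;  /* quads omitted here */

/* Returns the index of an existing vertex that matches *p exactly,
   otherwise appends *p and returns the new index.
   The caller must make sure there is room (nv < MAXV). */
unsigned addVert(Object *ob, const Vert *p) {
    for (unsigned i = 0; i < ob->nv; i++)
        if (ob->v[i].x == p->x && ob->v[i].y == p->y && ob->v[i].z == p->z)
            return i;
    ob->v[ob->nv] = *p;
    return ob->nv++;
}
```

The search makes every insert linear in the vertex count, which is the price paid for not tracking edge ids.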


Well, depending on how you count lines of code it's 20-25. addVert and addQuad make it 25-30. So let's call it 30 lines of code for Catmull Clark in C. C++ operators would make this a bit shorter. IQ claims 50 lines of code but if counted in the same way, his code is about the same, 30-35.

The code works as a two pass algorithm, first over the quads of the original model and secondly over the vertices of the new model. The first pass subdivides and accumulates displacement in a separate vector. This enables the first pass to calculate all the valences and store them before averaging the final displacement and applying it in the correct manner depending on the type of point (original, edge or face). There are a couple of gotchas but that's the basic idea. No winged edge, in fact no edges at all.
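Written out, the update rules that the two passes implement are, as far as I read the listing and assuming a closed mesh, the standard Catmull Clark ones. Here n is the valence of an original vertex P, F_i are the face points of the n faces around it, M_j are the midpoints of the n edges meeting at it, and an edge (A,B) is shared by faces with face points F_1 and F_2.

```latex
% face point: centroid of the quad (p_0, p_1, p_2, p_3)
F = \tfrac{1}{4}\,(p_0 + p_1 + p_2 + p_3)

% edge point for edge (A,B) between faces with face points F_1 and F_2
E = \tfrac{1}{4}\,(A + B) + \tfrac{1}{4}\,(F_1 + F_2)

% moved original vertex P of valence n
P' = \frac{1}{n}\left( \frac{1}{n}\sum_{i=1}^{n} F_i
     \;+\; \frac{2}{n}\sum_{j=1}^{n} M_j \;+\; (n-3)\,P \right)
```

In the code the two sums are what accumulates in disp[] during the first pass, and val[] holds n.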

Coding style sucks. Happy with result so thought I'd publish it.

One cool thing is that the model pictured is about 22 bytes to store. A chamfered cube is about 12 bytes and a sphere similar. It means we are looking at qoob providing a library of demo type objects in a few hundred bytes. The spheres are made of quads and do not suffer from peaks at top and bottom. The qoob library is a few kilobytes, unfortunately. DX would help but I think I'll target 64k so not to worry.

7/1/10

Qoob Progressing

My modeller is progressing. Here's a screen shot:



The model in the picture takes about 30 bytes to store. Smoothing is next up.
Organic stuff is escaping me right now but I'll work on that.

6/30/10

stupid idea 247: world's smallest vector maths library

Here is a tiny 3d vector maths library:
typedef float vec3[3];
float *vec(float *v, float x) {v[0]=v[1]=v[2]=x;return v;}

float mav(float *v1,float *v2,float *v3,float *v4,float *v5){
    for (int i=0; i!=3; i++) v3[i]=v1[i]*v4[i]+v2[i]*v5[i];
    return v3[0]+v3[1]+v3[2];
}

vec3 VZERO={0,0,0};
vec3 VONE={1,1,1}; vec3 VMONE={-1,-1,-1};

That's it. It can do zero, set, add, sub, length2, dotprod, negate, lerp, centroid. E.g. add is:
mav(v1, v2, v3, VONE, VONE ); // v3=v2+v1

Lerp means that centroid (the centre vertex of a quad) can be found without a division by using two lerps. For a moment it seems cool until you realise the calls to it don't compress all that well.
The challenges remaining are xprod, normalise and max/min. Then I have everything for handling my 3d modeller qoob. If I can make minor adjustments to get those, this might be worth pursuing.
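To show how the other operations fall out of the one multiply-add routine, here is a self-contained sketch. The wrapper names vsub, vdot and vlerp are mine and exist only for readability; in a size coded intro you would call mav directly.

```c
typedef float vec3[3];

/* out = a*sa + b*sb component-wise; returns the sum of out's components. */
float mav(const float *a, const float *b, float *out,
          const float *sa, const float *sb) {
    for (int i = 0; i != 3; i++) out[i] = a[i] * sa[i] + b[i] * sb[i];
    return out[0] + out[1] + out[2];
}

const vec3 VZERO = {0, 0, 0}, VONE = {1, 1, 1}, VMONE = {-1, -1, -1};

/* sub: out = a - b */
void vsub(const float *a, const float *b, float *out) {
    mav(a, b, out, VONE, VMONE);
}

/* dot product: the component sum of a*b (the second term is zero) */
float vdot(const float *a, const float *b) {
    vec3 tmp;
    return mav(a, VZERO, tmp, b, VZERO);
}

/* lerp: out = a*(1-t) + b*t */
void vlerp(const float *a, const float *b, float t, float *out) {
    vec3 wa = {1 - t, 1 - t, 1 - t}, wb = {t, t, t};
    mav(a, b, out, wa, wb);
}
```

length2 is vdot(a, a), and negate is mav with VMONE as the first weight and VZERO as the second.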

6/16/10

Very .cool

GLSL Minifier.

"GLSL Minifier is a tool to pack GLSL shader code. It generates a code as small as possible that is equivalent to the input code. The generated code is also designed to compress well. It has been optimized for use with Crinkler, but should perform well with any other compression tool. GLSL Minifier generates a C header file you can include from your code."

Very .cool

6/14/10

3d modelling for Intros - qoob.

I've been investigating meshes for a while. First attempts with Pohar from Rebels resulted in some pretty detailed modelling for 4k. It was the usual way: create a mesh in a modelling tool, compress it, load it, smooth it, just the same as iq's approaches to the problem. As dx provides smoothing (loop I think), this results in tiny code and models of 500 bytes or more as seed meshes. One of the problems with this technique is the model is equally "smooth" all over at the end. It's hard to have smooth parts and rough parts. Iq overcomes this by coding displacements.

This image of a dragon was done using this technique and the result (with crinkler, shader, mesh and mesh code) was just over 2k...

There's a good read about meshes in intros at Pouet.

I once did a 1k with several curved "lathe" objects, but it was com dropping and that's unreliable. I haven't fixed that so no code on Pouet right now. Maybe I'll try again with crinkler at some point. Lathes seem to be an option for modelling at 1k/4k. In the code I developed for gcc the models were just 8 bytes each before compression. The code to create the objects was around 350 bytes after compression and the objects are curved: there is non-linear interpolation between points in the model. Click on the image to see what I mean.
But Lathes are limited.

Another technique I looked into was intersections of regular primitives. This can often produce surprising results. The idea is simply to rotate/translate a single primitive. The two screenshots below illustrate some results. The first is a rotated cone, resulting in a star. The second is a very small routine (180 bytes after compression) to generate organic like structures. Again no randomness, just rotate, translate, rotate stuff.




You can see the last one live in the second half of the awesomely coloured :-) 4k intro Fruit of the Loop.

But my goal is always to get away from embarrassingly geometrical to something that looks modelled. It occurred to me that crinkler is very successful because it works at the compile stage for compression; it's not simply an exe packer. Thus my latest project is a modeller, the idea being to use the modelling commands to better compress the end model. qoob is a simple subdivision modeller based on wings3d functionality but very cut down. It's a fresh project but already I can see that no way will this be useful for 4k, only 64k. That's disappointing but it comes from the fact that the library used to reproduce objects is 2-3k. An early screenshot is not impressive:


The idea remains right though: this is the modelling equivalent of a texture builder.

11/5/09

FRequency To the Road of Ribbon in 1k for win32

Instead of producing my own stuff, I've spent time crunching down the original Linux 1k from Frequency. Mainly I did it to see it on my ati card (the original was messed up) but also to see if I could get a win32 version, written in C, down to 1024 bytes.

During the crunching it became obvious I'd have to make a few compromises:
  • The colour of the ribbon is the same as the walls. This is the biggest artistic compromise and definitely it can be criticised as compromising the original. To compensate I added a more dramatic camera path that gets very close to the ribbon.
  • Clock is less accurate and resets, causing a jump to a new viewpoint. This is not as good as the original.
  • View angle is smaller - caused by a new method of calculating view vectors. Not a big issue.
That's really it. On the plus side, I've cured a bug that caused the ribbon to be reflected many times and I made the new version run a bit faster, enabling a higher resolution. I've completely rewritten the code that generates the colour but I've stayed as close to the original as I could. I hope it's not too noticeable. However, mine is darker; perhaps I can cure this later.

This is a fantastic intro and not one I thought could be done in 1k in win32 in C when I started. Linux 1ks have smaller frameworks and so have more space available for the intro; and asm is of course more space efficient than C.

Here is the source code and an exe to try out. I hope you agree it has kept the essence of the original whilst being 1k in win32. I used crinkler 1.1 (even though 1.2 would produce smaller code) as 1.2 doesn't work for me. So the intro won't work under win7!!

Respect to FRequency from auld^titan.

11/1/09

Frequencies To The Road of Ribbon Reactivated

Frequency did a really great Linux 1k recently. To the Road of the Ribbon is a classic labyrinth but (to my knowledge) for the first time with reflections and even a ribbon added. It fits in Linux at 1k nicely but is about 1240 bytes under windows. To be fair, xt95 made no attempt to make it smaller under windows as it was small enough under Linux. For some reason it was buggy on my ATi box, so out of interest I converted the Linux assembler to C and ported it to windows.

I couldn't quite get the code down to 1k using crinkler under windows (setup for windows is larger). Perhaps some jolly fellow can convert it back to asm? You can get my C version here, with a much compressed shader. It's around 1070 bytes using crinkler.

There are some nice tricks in this intro. I like the hi-res clock passed in through glColor4ubv:

t=timeGetTime();
glColor4ubv((unsigned char *)(&t));

and then in the shader (my version):

float w=dot(vec3(gl_Color),vec3(256/256,256,256*256))*.256;

The dot product was a great idea to get a hi-res clock. I'm not sure if it's original to Frequency but it's very cool. I optimised the size a little by making the original vec3(1,256,65536) into the form above.
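To see why this yields a usable clock, here is a CPU-side emulation of the shader expression. It assumes a little-endian machine (as on x86 Windows) and that gl_Color delivers each byte as byte/255; shaderClock is a hypothetical name for this check only.

```c
#include <stdint.h>
#include <string.h>

/* Mirrors dot(vec3(gl_Color), vec3(1,256,65536)) * .256 on the CPU. */
float shaderClock(uint32_t t_ms) {
    unsigned char c[4];
    memcpy(c, &t_ms, 4);            /* the 4 bytes glColor4ubv receives */
    float r = c[0] / 255.0f;        /* gl_Color.r: lowest byte of t */
    float g = c[1] / 255.0f;        /* gl_Color.g */
    float b = c[2] / 255.0f;        /* gl_Color.b; alpha (top byte) is unused */
    return (r * 1.0f + g * 256.0f + b * 65536.0f) * 0.256f;
}
```

The low 24 bits of the millisecond timer come back scaled by .256/255, about 0.001, so w is close to seconds. The top byte is dropped, so the clock wraps after 2^24 ms, roughly 4.7 hours.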

10/21/09

Bad intro year

I started this year trying to code a full blown demo which is a long term work in progress now. In the mean time work took over and coding hasn't been possible. The hiatus has given me two very clear ideas for intros. I'll see if they come out as 4ks or something bigger later.

1/20/09

GLSL Shader Creation: Geometry, Vertex and Fragment

To the point: I recently expanded my tiny shader creation framework to include geometry shaders. Not much to say about it. Notice that geometry and vertex shaders are optional. The code is getting quite long so it might be worth testing if loading the function pointers in a loop and indexing them would be smaller. Nonetheless, it's highly repetitive so crinkler should cope well.

GLuint ShaderCompile(const char *gs, const char *vs, const char *fs) {
GLuint p,s;
p = ((PFNGLCREATEPROGRAMPROC)(wglGetProcAddress("glCreateProgram")))();
if (gs!=NULL) {
//if there is a geom shader...
s=((PFNGLCREATESHADERPROC)wglGetProcAddress("glCreateShader"))(GL_GEOMETRY_SHADER_EXT);
((PFNGLSHADERSOURCEPROC)wglGetProcAddress("glShaderSource")) (s, 1, &gs, NULL);
((PFNGLCOMPILESHADERPROC)wglGetProcAddress("glCompileShader"))(s);
((PFNGLATTACHSHADERPROC)wglGetProcAddress("glAttachShader")) (p,s);
}
if (vs!=NULL) {
//if there is a vertex shader...
s=((PFNGLCREATESHADERPROC)wglGetProcAddress("glCreateShader"))(GL_VERTEX_SHADER);
((PFNGLSHADERSOURCEPROC)wglGetProcAddress("glShaderSource")) (s, 1, &vs, NULL);
((PFNGLCOMPILESHADERPROC)wglGetProcAddress("glCompileShader"))(s);
((PFNGLATTACHSHADERPROC)wglGetProcAddress("glAttachShader")) (p,s);
}
s=((PFNGLCREATESHADERPROC)wglGetProcAddress("glCreateShader"))(GL_FRAGMENT_SHADER);
((PFNGLSHADERSOURCEPROC)wglGetProcAddress("glShaderSource")) (s, 1, &fs, NULL);
((PFNGLCOMPILESHADERPROC)wglGetProcAddress("glCompileShader"))(s);
((PFNGLATTACHSHADERPROC)wglGetProcAddress("glAttachShader")) (p,s);
((PFNGLLINKPROGRAMPROC)wglGetProcAddress("glLinkProgram"))(p);
return p;
}
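The loop-and-index alternative mentioned above could be sketched like this. To keep the sketch compilable without the Windows headers, the resolver is passed in as a parameter with a wglGetProcAddress-like shape; the Proc and Resolver typedefs, the enum and the table are all assumptions for illustration, and whether it really ends up smaller after crinkler is untested.

```c
/* Generic function pointer, cast to the real PFNGL... type at each call site. */
typedef void (*Proc)(void);
/* Shape of the resolver: takes a name, returns a function pointer. */
typedef Proc (*Resolver)(const char *name);

enum {
    F_CREATE_PROGRAM, F_CREATE_SHADER, F_SHADER_SOURCE, F_COMPILE_SHADER,
    F_ATTACH_SHADER, F_LINK_PROGRAM, F_USE_PROGRAM, F_COUNT
};

/* Names in the same order as the enum above. */
const char *const glNames[F_COUNT] = {
    "glCreateProgram", "glCreateShader", "glShaderSource", "glCompileShader",
    "glAttachShader", "glLinkProgram", "glUseProgram"
};

Proc glFn[F_COUNT];

/* Resolves every entry once; later code indexes glFn[] instead of
   repeating the name lookup. */
void loadGlFunctions(Resolver resolve) {
    for (int i = 0; i < F_COUNT; i++) glFn[i] = resolve(glNames[i]);
}
```

On Windows the call would be something like loadGlFunctions((Resolver)wglGetProcAddress), and a use site becomes ((PFNGLCOMPILESHADERPROC)glFn[F_COMPILE_SHADER])(s).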


(I'll return to DirectX and OpenGL next post.)

1/12/09

How to Use D3Dx and OpenGL together : Part 1

I love OGL but the OGL SDK is a shoddy collection of web links. Worse, although as size coders we have GLU, DX coders have a lot of advantages in the standard libraries they have access to. Fortunately, DirectX is quite well designed in the way its libraries are broken down, and from OGL it's possible to use the d3dx support library without rendering using D3d. It's a little overhead of course compared to rendering everything in d3d, but in terms of bytes at 4k it's almost nothing.

Inside d3dx are cool routines for getting normals to a mesh, smoothing meshes, raytracing (!) which can be used for collisions, splines for motion and camera paths, the elusive cube and torus, neither of which for some inexplicable reason appear in GLU, and a whole range of maths functions. In this little blog I'll show code to set up OGL and d3d together.


LPDIRECT3D9 pD3D;
LPDIRECT3DDEVICE9 pd3dDevice;

void initD3D(HWND hWnd)
{
D3DPRESENT_PARAMETERS d3dpp;

pD3D = Direct3DCreate9 (D3D_SDK_VERSION);
d3dpp.Windowed = TRUE;
d3dpp.SwapEffect = D3DSWAPEFFECT_DISCARD;
pD3D->CreateDevice( 0, D3DDEVTYPE_HAL, hWnd,
D3DCREATE_SOFTWARE_VERTEXPROCESSING, &d3dpp, &pd3dDevice );
}

The code above sets up D3D. To be precise, it initialises D3D enough so that we can use functions from d3dx. To be more precise, DirectX 9 functions. It may be possible to get smaller but I'm a beginner with d3d, so send me an email if you can beat it. As both pD3D and pd3dDevice are required by d3dx routines, we make them global. It may be possible to use a static declaration for d3dpp to make this smaller by a few bytes; a static would also start zeroed, whereas the local above leaves every field other than the two assigned ones uninitialised.

Now we need some code to initialise windows and opengl and use this routine to initialise D3D:


static PIXELFORMATDESCRIPTOR pfd={
0, // Size Of PFD... BAD coding, saves bytes
1, PFD_SUPPORT_OPENGL | PFD_DOUBLEBUFFER, 32, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 32, 0, 0, 0, 0, 0, 0, 0
};

void WINAPI WinMainCRTStartup()
{
HWND hWnd = CreateWindow( "EDIT", NULL, WS_POPUP|WS_VISIBLE|WS_MAXIMIZE,
0, 0, 0, 0, 0, 0, 0, 0 );
initD3D(hWnd);
HDC hDC = GetDC( hWnd );
SetPixelFormat ( hDC, ChoosePixelFormat ( hDC, &pfd) , &pfd );
wglMakeCurrent ( hDC, wglCreateContext (hDC) );
ShowCursor(FALSE);
.
.
}

The code above is a little larger than normal because the d3d initialisation requires the window handle as well as the GetDC call, so we have to store it in hWnd variable.

In this blog then, I've presented a tiny framework that initialises win32, opens a window, initialises d3d and initialises opengl. In the next Blog, I'll describe how to use mesh functions in d3dx and render using OpenGL.

1/1/09

Place to live

Joined Titan. Good to have a home :-). Titan gets a lot of bad press. So much so I once decided to leave, but luckily they are a forgiving bunch and let me back in. Inside Titan there are fabulously talented people and it's hard to see why Titan hasn't been more successful up to now. Anyway, thanks for letting me back in, I'm already enjoying myself.

12/29/08

Very small Code for Cornell Box

I don't often just add links here (first time) but here is one worth seeing for size coders. It's a complete global illumination solution in 99 lines of C++ code. I notice the exe can be less than 4k.

12/22/08

Code for Moving Camera in GLSL Raytracer

Suppose you have a "two triangles and a shader" program as iq likes to call them. Essentially, you cover the screen with a polygon and then draw everything with a shader. This might be raytracing, for example. The problem is this: how to move the camera around in very few bytes. Better, how to write a very small shader that lets us use OpenGL commands like gluLookAt and glRotate to control the camera?

A good solution hit me today so here is the code. First the main opengl loop to draw that polygon contains:

glRects(-1,-1,1,1);

This is one good advantage of OGL over D3D for size coding. A single polygon covers the screen. We can't use texture coords or colours at the vertices or we would have to define a quad or two triangles. So we are stuck with just the vertices.

Now the magical, tiny, Vertex shader which will give us a moving camera:

varying vec3 v,EP;
void main(){
gl_Position=gl_Vertex;
v = vec3( gl_ModelViewMatrix*gl_Vertex);
EP= vec3( gl_ModelViewMatrix*vec4(0,0,-1,1) );
}

The first line makes sure that the glRect still covers the screen in the pixel shader. It does not transform the vertex but leaves it where it is. The second line, however, records the transformed vertex in world space as a vec3. In effect this records the position in world space of every pixel on the screen.

The last line records the eye position. Arbitrarily, the eye position is hardcoded here to be one unit in Z away from the screen, giving a field of view of 90 degrees - quite normal for a camera. Note that the eye point needs a homogeneous value of 1 as the fourth co-ordinate to work properly.

We could have used vec4 for both lines above and the code would be shorter, but as most raytracing will use vec3 later, you can choose to bite the bullet and make the code longer here and shorter in the fragment shader. Horses for courses.

Now in the fragment shader it's easy to construct the ray to start tracing:

varying vec3 v,EP;
void main(){
vec3 Ro=EP; //set ray origin
vec3 Rd=v-Ro; //set ray direction

It's that easy. Now you can move the camera in your raytracer using normal OpenGL commands. To finish, here is an image from YAST (yet another sphere tracer, as I'm calling my glsl raytracer). I'm able to move around the spheres as I choose. As usual, click on the image to see a bigger version.
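The same ray setup can be checked on the CPU. This sketch mirrors the two shaders in plain C, assuming OpenGL's column-major matrix layout and that glRects emits its vertices at z=0; V3, xformPoint and rayDir are hypothetical names.

```c
typedef struct { float x, y, z; } V3;

/* vec3(M * vec4(x,y,z,1)) for a column-major 4x4 matrix m. */
V3 xformPoint(const float m[16], float x, float y, float z) {
    V3 r = { m[0] * x + m[4] * y + m[8]  * z + m[12],
             m[1] * x + m[5] * y + m[9]  * z + m[13],
             m[2] * x + m[6] * y + m[10] * z + m[14] };
    return r;
}

/* Unnormalised ray direction through screen point (sx,sy) in [-1,1].
   The ray origin (the shader's EP) is written to *origin. */
V3 rayDir(const float m[16], float sx, float sy, V3 *origin) {
    V3 v  = xformPoint(m, sx, sy, 0.0f);        /* "v" in the vertex shader */
    V3 ep = xformPoint(m, 0.0f, 0.0f, -1.0f);   /* "EP" in the vertex shader */
    V3 d  = { v.x - ep.x, v.y - ep.y, v.z - ep.z };
    *origin = ep;
    return d;
}
```

With the identity matrix the centre ray is (0,0,1) and the ray through the right hand edge is (1,0,1), 45 degrees apart, which is where the 90 degree field of view comes from.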

Out of interest, on an x1950, 1024x768, 30 spheres, one light, shadows and 3 levels of reflection, I'm getting around 50fps.

12/21/08

Getting OpenGL shaders even smaller



Flow2 is a 1k of mine from over 12 months ago. It was 1022 bytes. Since then, several things changed. Firstly, ATi fixed their drivers for *some* cards so that you no longer need a vertex shader/fragment shader pair. As per the glsl spec only a fragment shader is necessary. Fearmoths exploited this in their Linux 1k You Massive Clod. Secondly, last January Crinkler 1.1a was released which compressed better (20 or so bytes). Thirdly, I got better at size reducing C code for crinkler.

So, it was time to revisit an old intro and try to size reduce it. As it happens flow2 didn't have a proper timer so I decided to add one, taking the intro way up over 1024 bytes. Nonetheless the current exe is ... wait for it... 890 bytes!!
That's extra timer code and still 130 bytes smaller. OK, some code. Flow2 is a "one polygon and a shader" style of intro. Now that you don't need a vertex shader, just a pixel shader, the tiny glsl code can be even smaller:

void setShaders() {
GLuint p,s;
s = ((PFNGLCREATESHADERPROC)wglGetProcAddress("glCreateShader"))(GL_FRAGMENT_SHADER);
((PFNGLSHADERSOURCEPROC)wglGetProcAddress("glShaderSource")) (s, 1, &fsh, NULL);
((PFNGLCOMPILESHADERPROC)wglGetProcAddress("glCompileShader"))(s);
p = ((PFNGLCREATEPROGRAMPROC)wglGetProcAddress("glCreateProgram"))();
((PFNGLATTACHSHADERPROC)wglGetProcAddress("glAttachShader"))(p,s);
((PFNGLLINKPROGRAMPROC)wglGetProcAddress("glLinkProgram"))(p);
((PFNGLUSEPROGRAMPROC) wglGetProcAddress("glUseProgram"))(p);
}


Now the big question...can you fit a synth and music in 130 bytes? Midi for sure but a real synth? The wav header alone is around 43 bytes after compression, add in PlaySoundA and already half the bytes are gone. It seems unlikely but ... well...
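For scale, this is the canonical 44 byte PCM wav header a synth would have to place in front of its samples. The struct and initWavHeader are a sketch with names of my choosing, fixed to 16 bit samples and assuming a little-endian machine; handing the finished buffer to PlaySoundA with SND_MEMORY is the usual route, though that call is not shown here.

```c
#include <stdint.h>
#include <string.h>

typedef struct {
    char     riff[4];        /* "RIFF" */
    uint32_t fileSize;       /* total file size minus 8 */
    char     wave[4];        /* "WAVE" */
    char     fmt[4];         /* "fmt " */
    uint32_t fmtSize;        /* 16 for PCM */
    uint16_t format;         /* 1 = PCM */
    uint16_t channels;
    uint32_t sampleRate;
    uint32_t byteRate;       /* sampleRate * blockAlign */
    uint16_t blockAlign;     /* channels * bitsPerSample / 8 */
    uint16_t bitsPerSample;
    char     data[4];        /* "data" */
    uint32_t dataSize;       /* bytes of sample data that follow */
} WavHeader;

/* Fills a header for 16 bit PCM with nFrames sample frames. */
void initWavHeader(WavHeader *h, uint32_t rate, uint16_t channels, uint32_t nFrames) {
    memcpy(h->riff, "RIFF", 4);
    memcpy(h->wave, "WAVE", 4);
    memcpy(h->fmt,  "fmt ", 4);
    memcpy(h->data, "data", 4);
    h->fmtSize       = 16;
    h->format        = 1;
    h->channels      = channels;
    h->sampleRate    = rate;
    h->bitsPerSample = 16;
    h->blockAlign    = (uint16_t)(channels * 2);
    h->byteRate      = rate * h->blockAlign;
    h->dataSize      = nFrames * h->blockAlign;
    h->fileSize      = 36 + h->dataSize;
}
```

Most of the fields are constants or simple products of each other, but a good number of them are distinct values, which limits how far the header will compress.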