PublicArchive/bevy

mirror of https://github.com/bevyengine/bevy.git synced 2026-07-01 00:05:45 -04:00

Author	SHA1	Message	Date
Visse	6f270d4776	Expose pipeline constants to materials (#24502 ) # Objective Allow materials to use pipeline constants ([pipeline-overridable constants](https://www.w3.org/TR/WGSL/#override-decls)). They are already available in wgpu, but bevy didn't expose them. ## Solution Expose constants in `RenderPipelineDescriptor` & `ComputePipelineDescriptor`, allowing materials to specify them in their `specialize` function. Note: I had to remove the `Eq` derive from `ComputePipelineDescriptor`, `VertexState` and `FragmentState`, due to the new f64 field. It was already a bit inconsistent with `RenderPipelineDescriptor` not having it. ## Testing - Ran `cargo check` - Created an example & ran it - Couldn't run `cargo test` due to it taking looots of disk space to run, but I have a hard time seeing it break something at runtime ## Showcase See the added example, where pipeline constants are used to change the `LEVELS` override in WGSL. <img width="760" height="289" alt="Screenshot from 2026-05-31 09-46-05" src="https://github.com/user-attachments/assets/6902757c-aea4-4b91-9ff0-e653ce4c3448" />	2026-06-09 00:06:01 +00:00
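The note in 6f270d4776 about dropping the `Eq` derive follows from a Rust rule: `f64` implements `PartialEq` but not `Eq`, because NaN is not equal to itself. The sketch below illustrates that rule only. `DescriptorSketch` and `with_constant` are hypothetical names made up for this example, not Bevy's or wgpu's actual types; only the `LEVELS` constant name comes from the commit message.

```rust
use std::collections::BTreeMap;

/// Hypothetical stand-in for a pipeline descriptor that carries
/// pipeline-overridable constants keyed by name (not Bevy's real type).
/// `Eq` cannot be derived here because `f64` is only `PartialEq`.
#[derive(Clone, Debug, PartialEq)]
pub struct DescriptorSketch {
    pub label: String,
    pub constants: BTreeMap<String, f64>,
}

/// Builds a descriptor sketch holding a single named constant.
pub fn with_constant(label: &str, name: &str, value: f64) -> DescriptorSketch {
    let mut constants = BTreeMap::new();
    constants.insert(name.to_string(), value);
    DescriptorSketch {
        label: label.to_string(),
        constants,
    }
}
```

Two sketches built with `LEVELS = 4.0` compare equal, while a sketch holding `f64::NAN` does not even equal its own clone. A type containing such a field therefore cannot promise the reflexive equality that `Eq` requires.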
Luo Zhihao	7517c61ecd	Add options to lower precision/compressed vertex buffer (#21926 ) # Objective Resolves #21902. ## Solution This PR adopts a relatively transparent approach to reduce the GPU vertex buffer size. On CPU-side mesh can still use uncompressed Float32 data, and users are not required to insert compressed vertex formats. The vertex data is automatically processed into lower-precision/octahedral encoded data when uploading to the GPU. To enable vertex attribute compression, just set the `attribute_compression` field of Mesh, or set `mesh_attribute_compression` of GltfLoaderSettings. If enabled, normal and tangent will be octahedral encoded Unorm16x2, uv0, uv1, joint weight and color will be corresponding Unorm16 or Float16. I also provide Unorm8x4 for vertex color if hdr isn't needed. Update 2026-2-16 Removed previous approach that automatically compresses vertex buffer according to flags when uploading to GPU. Instead, I added `compressed_mesh` method to Mesh to construct compressed Mesh ahead of time. GltfLoader can also opt-in mesh compressing when loading. I also add an option to convert indices to u16, though I believe blender gltf exporter already uses u16 indices when possible. ## Testing Run `many_cubes`, `many_foxes`, `many_morph_targets` with `--vertex-compression` to test 3d. Run `bevymark` with `sprite_mesh` to test 2d, because `SpriteMesh` uses compressed quad mesh now. --------- Co-authored-by: Greeble <166992735+greeble-dev@users.noreply.github.com>	2026-06-02 03:22:56 +00:00
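For 7517c61ecd, which stores normals and tangents as octahedral encoded Unorm16x2, the following is a minimal CPU-side sketch of one common octahedral mapping. It is an assumption that the PR uses this exact variant. The function names (`encode_unorm16x2`, `decode_unorm16x2`) are hypothetical and are not the PR's API.

```rust
/// Returns 1.0 for non-negative input and -1.0 otherwise (never 0.0).
fn sign_not_zero(v: f32) -> f32 {
    if v >= 0.0 { 1.0 } else { -1.0 }
}

/// Folds the lower hemisphere over the diagonals of the octahedron.
fn fold(x: f32, y: f32) -> (f32, f32) {
    (
        (1.0 - y.abs()) * sign_not_zero(x),
        (1.0 - x.abs()) * sign_not_zero(y),
    )
}

/// Maps [-1, 1] to a 16-bit unsigned normalized integer.
fn to_unorm16(v: f32) -> u16 {
    ((v * 0.5 + 0.5).clamp(0.0, 1.0) * 65535.0).round() as u16
}

/// Maps a 16-bit unsigned normalized integer back to [-1, 1].
fn from_unorm16(v: u16) -> f32 {
    f32::from(v) / 65535.0 * 2.0 - 1.0
}

/// Encodes a non-zero direction vector into two Unorm16 values.
pub fn encode_unorm16x2(n: [f32; 3]) -> [u16; 2] {
    // Project onto the octahedron |x| + |y| + |z| = 1.
    let l1 = n[0].abs() + n[1].abs() + n[2].abs();
    let (x, y) = (n[0] / l1, n[1] / l1);
    let (x, y) = if n[2] < 0.0 { fold(x, y) } else { (x, y) };
    [to_unorm16(x), to_unorm16(y)]
}

/// Decodes two Unorm16 values back into a unit vector.
pub fn decode_unorm16x2(e: [u16; 2]) -> [f32; 3] {
    let (x, y) = (from_unorm16(e[0]), from_unorm16(e[1]));
    let z = 1.0 - x.abs() - y.abs();
    let (x, y) = if z < 0.0 { fold(x, y) } else { (x, y) };
    let len = (x * x + y * y + z * z).sqrt();
    [x / len, y / len, z / len]
}
```

In this sketch a normal takes 4 bytes instead of the 12 bytes of three `f32` components, at the cost of a small quantization error after decoding.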
Christophe Dehais	5ec83e1185	Improve Order Independent Transparency example (#22781 ) # Objectives - Use a cleaner UI inspired by other examples - Add a scene with custom material to fix #20297 ## Testing Running the example ## Showcase <details> <summary>Click to view showcase</summary> <img width="872" height="732" alt="Screenshot From 2026-02-06 11-35-19" src="https://github.com/user-attachments/assets/cc294c33-bb53-4ed4-9dce-7558f3bb8fee" /> <img width="872" height="732" alt="Screenshot From 2026-02-06 11-35-30" src="https://github.com/user-attachments/assets/c2b79651-c99e-4bcc-8089-518d11631b9f" /> </details> --------- Co-authored-by: Carter Anderson <mcanders1@gmail.com>	2026-05-21 15:13:25 +00:00
Duncan	afd576f380	Alpha discard (#11895 ) # Objective I want to call the `pbr_functions::alpha_discard` shader function from my own material shader, but it only takes a `StandardMaterial` parameter. ## Solution This PR replaces the `StandardMaterial` parameter with less restrictive ones so my shader can call it. --- ## Changelog - The signature of `pbr_functions::alpha_discard` changed by replacing `StandardMaterial` with only the `flags` and `alpha_cutoff` fields. ## Migration Guide Replace this: ```wgsl pbr_functions::alpha_discard(material, ...); ``` with this: ```wgsl pbr_functions::alpha_discard(material.flags, material.alpha_cutoff, ...); ``` --------- Co-authored-by: Duncan Fairbanks <duncanfairbanks6@gmail.com>	2026-05-21 08:49:32 +00:00
Patrick Walton	1b0f112d90	Implement GPU clustering for lights, light probes, and decals. (#23036 ) Currently, Bevy clusters lights on the CPU. This is generally not considered a best practice any longer, and it can be a bottleneck in workloads like `many_lights`. Moreover, it prevents GPU systems like [Hanabi] from creating clusterable objects such as lights and decals without a round trip to the CPU. This PR introduces GPU light clustering when supported by the hardware. The algorithm is the same as the existing CPU light clustering, but parallelized over all clusters, and the resulting on-GPU format for clusters is unchanged. GPU light clustering uses the hardware rasterizer for compute purposes as a way to automatically distribute workloads within 2D axis-aligned bounding boxes without actually rendering any pixels, a first for Bevy. The algorithm is as follows, with each step corresponding to a raster or compute command: 1. Z slicing: We have a 3D cluster froxel grid of size WxHxD and seek to rasterize D axis-aligned quads, each of size WxH, representing the range of each clusterable object. In this compute phase, we generate D indirect instances for each clusterable object for the subsequent indirect draws. 2. Count rasterization: We use instanced indirect drawing to rasterize each quad generated in step 1 to a viewport of size WxH, with color writes disabled. Each rasterized fragment represents a cluster-object pair. In the fragment shader, we check to see if the object intersects the cluster, and, if it does, we atomically bump a counter corresponding to the number of objects of the given type intersecting the cluster in question. We don't record the ID of the object in this phase; we simply count the number of objects. 3. Local allocation: Now that we know the number of objects of each type in each cluster, we can proceed to allocate space in the clustered object buffer for each clustered object list. 
To do this, we need to perform a [prefix sum] operation so that each list is tightly packed with the others. For example, if adjacent clusters have 2, 5, and 3 objects, they'll be allocated at offsets 0, 2, and 7 respectively. This local step uses a [Hillis-Steele scan] in shared memory to compute the prefix sum of each chunk of 256 clusters. We can't go beyond 256 clusters in this local step because 256 is the maximum workgroup size in `wgpu`. 4. Global allocation: To deal with the fact that we can't calculate prefix sums beyond 256 clusters in step 3, we employ this second step that does a sequential loop over every 256-cluster chunk, propagating the prefix sum. At the end of this step, every list of clustered objects is allocated. 5. Populate rasterization: Finally, we issue an instanced indirect draw command using the same parameters as step (2). We test each cluster-object pair for intersection, and, if the test passes, we record the ID of each clustered object into the correct space in the list, using a scratch pad buffer of atomics to store the position of the next object in each list. The buffer of clustered objects has a fixed size and can overflow. We detect this condition via asynchronous CPU readback and automatically grow the buffer for subsequent frames. In this case, we also log a message so that the developer can choose a larger initial buffer size and avoid any incorrect frames. Additionally, like #22874, the automatic clustering heuristics are dynamically adjusted from frame to frame, by recording statistics on the GPU and using CPU readback to download them back to the CPU for processing. As part of this PR, I refactored clustered visibility so that clustered objects go through the same `ViewVisibility` system as other objects, instead of using `VisibleClusterableObjects`. This was a nice simplification. 
On the `many_lights` benchmark, with about 8,000 lights visible out of 100,000, this process takes approximately 0.099 ms on my NVIDIA GeForce RTX 4070 Laptop GPU. The AMD Ryzen 9 8945HS CPU, however, takes 2.12 ms to do the same task. The GPU version is therefore a 21x speedup. `main` `assign_objects_to_clusters` time, 2.12 ms: <img width="2756" height="1800" alt="Screenshot 2026-02-17 222757" src="https://github.com/user-attachments/assets/66341ad2-96f2-4e4a-87ee-fe3462bc05de" /> GPU clustering GPU time, 0.099 ms: <img width="2756" height="1800" alt="Screenshot 2026-02-17 222458" src="https://github.com/user-attachments/assets/18e2e0ae-a946-4b80-b38a-0543e76ebc02" /> `main`, 5.71 ms median frame time, 175 FPS: <img width="2756" height="1800" alt="Screenshot 2026-02-17 222243" src="https://github.com/user-attachments/assets/111c8e22-414f-4ee1-95fa-d7cfe422c2ab" /> GPU clustering, 4.88 ms median frame time, 205 FPS: <img width="2756" height="1800" alt="Screenshot 2026-02-17 222256" src="https://github.com/user-attachments/assets/0a662e88-a1b9-49c8-8bab-cc12b46cd079" /> [Hanabi]: https://github.com/djeedai/bevy_hanabi [prefix sum]: https://en.wikipedia.org/wiki/Prefix_sum [Hillis-Steele scan]: https://en.wikipedia.org/wiki/Prefix_sum#Algorithm_1:_Shorter_span,_more_parallel ## Alice's PM Note from @kfc35 Fixes https://github.com/bevyengine/bevy/issues/22957 and also fixes https://github.com/bevyengine/bevy/issues/22904.	2026-02-28 17:07:33 +00:00
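Steps 3 and 4 of 1b0f112d90 (local and global allocation) can be sketched on the CPU as below. This is a sequential illustration, not the PR's shader code. The function names are hypothetical, and computing the exclusive offset by subtracting each cluster's own count from an inclusive scan is an assumption of this sketch. The 256-cluster chunk size and the 2, 5, 3 example come from the commit message.

```rust
/// Number of clusters handled by one local scan, per the commit message.
const CHUNK: usize = 256;

/// Inclusive Hillis-Steele scan of one chunk. Each pass adds the value
/// `stride` slots to the left, doubling `stride` until it covers the chunk.
/// The `prev` copy plays the role of the double buffer a GPU version needs.
fn hillis_steele_inclusive(chunk: &[u32]) -> Vec<u32> {
    let mut cur = chunk.to_vec();
    let mut stride = 1;
    while stride < cur.len() {
        let prev = cur.clone();
        for i in stride..cur.len() {
            cur[i] = prev[i] + prev[i - stride];
        }
        stride *= 2;
    }
    cur
}

/// Returns the start offset of each cluster's list so that all lists are
/// tightly packed. `counts[i]` is the number of objects in cluster `i`.
pub fn allocate_offsets(counts: &[u32]) -> Vec<u32> {
    let mut offsets = Vec::with_capacity(counts.len());
    // Global step: running total of all earlier chunks.
    let mut base = 0u32;
    for chunk in counts.chunks(CHUNK) {
        // Local step: scan within one chunk.
        let inclusive = hillis_steele_inclusive(chunk);
        for (i, &count) in chunk.iter().enumerate() {
            // Exclusive offset = inclusive sum minus the cluster's own count.
            offsets.push(base + inclusive[i] - count);
        }
        base += inclusive.last().copied().unwrap_or(0);
    }
    offsets
}
```

With counts of 2, 5, and 3 this yields offsets 0, 2, and 7, matching the example in the commit message. The sequential loop over chunks mirrors the global allocation step that propagates the sum past the 256-cluster limit.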
Jordan Halase	5e1630bfd8	Fix 16 byte alignment typo (WebGL 2: 16 bit -> 16 byte) (#23124 ) # Objective WebGL 2 requires 16 byte UBO alignment. Some comments incorrectly state 16 bits. ## Solution Fix comment typos. ## Testing N/A --- ## Showcase N/A	2026-02-24 00:51:54 +00:00
Chris Biscardi	71ce303ec2	compute-shader mesh generation example (#22296 ) # Objective People have been asking how to get a compute shader-built mesh into bevy's "stuff". Some people want to control the lifetime of the mesh via Handle, and others don't know how to set data in bind groups. ## Solution A new example that shows how to initialize a mesh handle with a render_world usage mesh, and then put the output of the compute shader into the mesh_allocator slab for the mesh. The demo creates a scene with a camera, light, a circular base mesh, and an empty "cube to be" mesh that is shared by cloning the handle across two entities. The compute shader then fills in the data directly into the mesh_allocator slabs for the vertex/index buffers. If the compute shader failed, there would be no cube meshes showing as the data would be empty. ## Testing ``` cargo run --example compute_mesh ``` --- ## Showcase <img width="3392" height="2106" alt="screenshot-2025-12-29-at-16 06 48@2x" src="https://github.com/user-attachments/assets/88d8fed4-e3c1-418e-bb04-6f08d673403a" />	2026-01-16 00:06:16 +00:00
Patrick Walton	4189ba072d	Provide a mechanism for applications to invoke the single-pass downsampler. (#22286 ) The [AMD FidelityFX single-pass downsampler] (SPD) is the fastest way to generate mipmap levels of a texture. Bevy currently has two separate ports of that algorithm to WGSL: one for use in the environment map generation and one for use on the depth buffer for the purposes of occlusion culling (though the latter isn't the best use of it). Absent is any mechanism to use the single-pass downsampler to generate mipmap levels of a color texture for typical use in rendering. This is a standard feature in game engines: for example, Unity has [`GenerateMips`] and Unreal has [`bAutoGenerateMips`]. This PR adds a mechanism by which applications can invoke SPD to generate mipmap levels for any `Image`. Using this mechanism is a two step process. First, the application adds the `Handle<Image>` to a resource, `MipGenerationJobs` and associates it with a phase, which is an arbitrary ID chosen by the application. Second, the application adds a `MipGenerationNode` for that phase to the render graph. During rendering, the `MipGenerationNode` invokes SPD to generate a full mipmap chain for all textures in that phase. The reason why mipmap generation jobs are associated with phases is that the generation of mipmaps may need to occur at precise points in the application rendering cycle. For example, consider the common situation of a mipmapped portal texture. The mipmaps must be generated after the portal is rendered, but before the object in the main world displaying the portal texture is drawn. The phased approach taken in this PR allows complex dependencies like this to be expressed using the node graph feature that Bevy already possesses. (In the future, if render graphs are removed in favor of systems, this approach can naturally be reframed in terms of systems, so this patch contains no hazards in that regard.) 
Note that this patch by itself doesn't automatically generate mipmaps for imported textures that don't have them the way that [`bevy_mod_mipmap_generator`] does, in order to keep this patch relatively small and self-contained. However, it'd be straightforward to either (a) extend `bevy_mod_mipmap_generator`, (b) write another plugin, and/or (c) add a new feature to Bevy itself, all built on top of this PR, to support automatic GPU mip generation for image assets that don't have them. A new example, `dynamic_mip_generation`, has been added. This is a 2D example that produces a texture at runtime on the CPU and invokes the new `MipGenerationNode` that this patch adds to generate mipmaps for that texture at runtime. The colors of the texture are randomly generated, and UI for the example allows the texture to be regenerated and for the size to be adjusted; this proves that the mipmap levels for the texture are indeed generated at runtime and not pre-calculated at build time. Note that, although the example is 2D, the feature that this patch adds can be equally used in 2D and 3D. [AMD FidelityFX single-pass downsampler]: https://gpuopen.com/fidelityfx-spd/ [`GenerateMips`]: https://docs.unity3d.com/ScriptReference/Rendering.CommandBuffer.GenerateMips.html [`bAutoGenerateMips`]: https://dev.epicgames.com/documentation/en-us/unreal-engine/API/Plugins/DisplayClusterConfiguration/FDisplayClusterC-_33/bAutoGenerateMips [`bevy_mod_mipmap_generator`]: https://github.com/DGriffin91/bevy_mod_mipmap_generator	2025-12-31 23:05:54 +00:00
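As background for the "full mipmap chain" that 4189ba072d generates, the sketch below computes how many levels such a chain has and the size of each level. It only illustrates the arithmetic and is not the SPD algorithm or the `MipGenerationNode` API. The function names are hypothetical, and the halve-and-round-down rule is the usual convention, assumed here rather than taken from the commit.

```rust
/// Number of levels in a full mip chain: floor(log2(max(w, h))) + 1.
pub fn mip_level_count(width: u32, height: u32) -> u32 {
    32 - width.max(height).max(1).leading_zeros()
}

/// Sizes of every level, starting at the base level and ending at 1x1.
/// Each dimension is halved (rounding down) but never drops below 1.
pub fn mip_chain_sizes(width: u32, height: u32) -> Vec<(u32, u32)> {
    let (mut w, mut h) = (width.max(1), height.max(1));
    let mut sizes = vec![(w, h)];
    while w > 1 || h > 1 {
        w = (w / 2).max(1);
        h = (h / 2).max(1);
        sizes.push((w, h));
    }
    sizes
}
```

For example, a 256x256 texture has 9 levels, and an 8x2 texture has the levels 8x2, 4x1, 2x1, and 1x1.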
IceSentry	681751647a	Add FullscreenMaterial (#20414 ) # Objective - Users often want to run a fullscreen shader but the current solution involves copying the custom_post_processing example which is a 350 line file with a lot of low level wgpu complexity. Users shouldn't have to deal with that just to make a fullscreen shader ## Solution - Introduce a new FullscreenMaterial trait and FullscreenMaterialPlugin - This new material will run a fullscreen triangle with the specified shader. It builds on top of the existing FullscreenShader infrastructure - It lets users customize the node ordering. There are no defaults right now because it's intended as a bit of a primitive plugin. Eventually we could have some kind of default for custom post processing ## Testing Made a new fullscreen_material example and made sure it works ## Follow up Once this is merged there are various things that should be done to improve it. Add the option to bind the depth texture, offer defaults for post processing, use a full AsBindGroup, add a way to bind the gbuffer. --------- Co-authored-by: JMS55 <47158642+JMS55@users.noreply.github.com> Co-authored-by: Alice Cecile <alice.i.cecile@gmail.com>	2025-12-14 22:35:20 +00:00
mgi388	d60a1b8166	Add MeshTag to array_texture example to demonstrate layer selection in shader (#21989 ) ## Objective - When looking at the `array_texture` example, it wasn't clear to me how I could send the "layer" to the GPU, but it turns out that [MeshTag is one recommended way](https://discord.com/channels/691052431525675048/866787577687310356/1444495450999754823) to pass this. - The example previously extracted a fake "layer" from the world position, but IIUC this isn't the most realistic way to demonstrate layer selection. ## Solution - Update the `array_texture` example by using `MeshTag`. - Add a system to the example that periodically changes the `MeshTag` on entities to show that the mesh tag can also change dynamically at runtime (and show how easy it is). ## Testing and showcase Before, you can see each cube's texture is fixed. <img width="1280" height="747" alt="image" src="https://github.com/user-attachments/assets/1ffde7db-8110-4431-b4e8-3a5a4ba5c5db" /> After, you can see each cube's texture changes as time passes. https://github.com/user-attachments/assets/b1227659-5886-4d2c-a401-84b80423c798 ---- I'm hoping for a rendering dev to validate this approach is correct, and useful. I think it is, but [I'm only just starting to understand](https://discord.com/channels/691052431525675048/866787577687310356/1444888786478829668) how to use this stuff and it's [possibly not the only way](https://discord.com/channels/691052431525675048/866787577687310356/1444888304020488365) so I don't want to submit this if it's the wrong approach to teach future me's.	2025-12-09 23:38:23 +00:00
Patrick Walton	00f6eb7a1c	Implement the infrastructure needed to support portals and mirrors. (#13797 ) Implement the infrastructure needed to support portals and mirrors. Bevy currently supports multiple cameras and rendering to off-screen render targets, so one might naïvely think that the engine has support for portals and mirrors already. However, Bevy is missing two key features that enable portals and mirrors at present: 1. Bevy has support for neither custom clip planes nor oblique clip planes. This prevents the construction of proper portals or mirrors, as meshes that intersect the portal plane must be clipped to render properly. 2. Bevy has no support for cameras that invert the culling mode, so meshes that are reflected across a plane will render inside-out. This PR addresses the two issues above: 1. This commit introduces a new field on `PerspectiveProjection`, `near_plane`, which allows the application to specify a custom near plane. That feature fully enables [Lengyel oblique clipping], which is the most optimal way to achieve a custom near clipping plane. It allows us to avoid having to support custom clip planes, which are often implemented inefficiently in hardware. 2. This patch adds a new field on the `Camera` component, `invert_culling`. This field causes the Bevy renderer to invert the front face setting when rendering the objects visible from that camera. When coupled with an appropriately-set [Householder matrix] on the camera, this allows correct rendering of objects reflected across a plane. Additionally, this PR adds a new function to `bevy_math::mat3`, `reflection_matrix`. This generates the matrix that reflects objects across a plane, suitable for encoding into a `Transform`. It's fully documented for ease of use. Finally, a new example, `mirror`, has been added. 
This example is a complete instance of a working mirror, combining a camera with a Householder matrix, oblique projection, and inverted culling with a custom material to render an animated mesh and its planar reflection. The camera and mesh may be moved with the mouse, and the off-screen render target that stores the rendered contents of the mirror world is properly resized when the user resizes the window. [Lengyel oblique clipping]: https://terathon.com/lengyel/Lengyel-Oblique.pdf [Householder matrix]: https://en.wikipedia.org/wiki/Householder_transformation <img width="2564" height="1500" alt="Screenshot 2025-12-05 212155" src="https://github.com/user-attachments/assets/35652b58-a9a5-415a-bdff-367889a23b9f" />	2025-12-09 23:08:15 +00:00
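The Householder reflection mentioned in 00f6eb7a1c can be sketched with plain arrays as below. This is a generic illustration for a plane through the origin, R = I - 2nnᵀ. The names (`reflection_about_plane`, `transform`, `determinant`) are hypothetical, and this is not the signature of the `reflection_matrix` function the PR adds.

```rust
/// Row-major 3x3 matrix.
pub type Mat3 = [[f32; 3]; 3];

/// Householder matrix I - 2 n n^T for a plane through the origin.
/// `normal` must be non-zero; it is normalized here.
pub fn reflection_about_plane(normal: [f32; 3]) -> Mat3 {
    let len = (normal[0] * normal[0] + normal[1] * normal[1] + normal[2] * normal[2]).sqrt();
    let n = [normal[0] / len, normal[1] / len, normal[2] / len];
    let mut m = [[0.0f32; 3]; 3];
    for r in 0..3 {
        for c in 0..3 {
            let identity = if r == c { 1.0 } else { 0.0 };
            m[r][c] = identity - 2.0 * n[r] * n[c];
        }
    }
    m
}

/// Multiplies a matrix by a column vector.
pub fn transform(m: &Mat3, v: [f32; 3]) -> [f32; 3] {
    [
        m[0][0] * v[0] + m[0][1] * v[1] + m[0][2] * v[2],
        m[1][0] * v[0] + m[1][1] * v[1] + m[1][2] * v[2],
        m[2][0] * v[0] + m[2][1] * v[1] + m[2][2] * v[2],
    ]
}

/// Determinant by cofactor expansion along the first row.
pub fn determinant(m: &Mat3) -> f32 {
    m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
        - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
        + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0])
}
```

Reflecting the point (1, 2, 3) across the plane y = 0 gives (1, -2, 3). The determinant of any such matrix is -1, which flips triangle winding. That is why the commit pairs the reflection with the `invert_culling` camera field.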
Patrick Walton	185712fbef	Add support for normal maps, metallic-roughness maps, and emissive maps to clustered decals. (#22039 ) This commit expands the number of textures associated with each clustered decal from 1 to 4. The additional 3 textures apply normal maps, metallic-roughness maps, and emissive maps respectively to the surfaces onto which decals are projected. Normal maps are combined using the [Whiteout blending method] from SIGGRAPH 2007. This approach was chosen because, subjectively, it appeared better than the more complex [reoriented normal mapping (RNM)] approach. Additionally, Whiteout normal map blending is commutative and associative, unlike RNM, which is a useful property for our decals, which are currently applied in an unspecified order. (The fact that the order in which our decals are applied is unspecified is unfortunate, but is a long-standing issue and should probably be fixed in a followup.) In particular, commutativity is desirable because otherwise one must specify which normal map is the base normal map and which normal map is the detail normal map, but that's not a policy decision that Bevy can unconditionally make, as decals aren't necessarily more detailed than the base normal map. (For instance, consider a bullet hole decal embedded in a wall with a subtle rough texture; one might reasonably argue that the base material's normal map is the detail map and the bullet hole is the base map, even though the bullet hole's normal map comes from a decal.) Note that, with a custom material shader, it's possible for application code to use the decal images for arbitrary other purposes. For example, with a custom shader an application might use the metallic-roughness map as a clearcoat map instead if it has no need for a metallic-roughness map on a decal. And, of course, a custom material shader could adopt RNM blending for decals if it wishes. A new example, `clustered_decal_maps`, has been added. 
This example demonstrates the new maps by spawning clustered decals with maps randomly over time and projecting them onto a wall. <img width="2564" height="1500" alt="Screenshot 2025-12-05 095953" src="https://github.com/user-attachments/assets/255fca64-2b42-4794-a367-14336d023310" />	2025-12-09 18:14:55 +00:00
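The Whiteout blend referenced in 185712fbef is usually written as normalize(n1.xy + n2.xy, n1.z * n2.z) on tangent-space normals. The sketch below uses that common formulation, and it is an assumption that the PR's shader matches it exactly. `whiteout_blend` is a hypothetical name.

```rust
/// Blends two tangent-space normals with the Whiteout method:
/// add the xy components, multiply the z components, then renormalize.
pub fn whiteout_blend(a: [f32; 3], b: [f32; 3]) -> [f32; 3] {
    let x = a[0] + b[0];
    let y = a[1] + b[1];
    let z = a[2] * b[2];
    let len = (x * x + y * y + z * z).sqrt();
    [x / len, y / len, z / len]
}
```

Because addition and multiplication are both symmetric in their operands, swapping `a` and `b` gives the same result, which is the commutativity the commit relies on for decals applied in an unspecified order. Blending with the flat normal (0, 0, 1) leaves a unit normal unchanged.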
shunkie	0410482c13	Remove unused `num_workgroups` from game_of_life shader (#21944 ) # Objective Remove unused `num_workgroups`. ## Testing ``` cargo run --example compute_shader_game_of_life ```	2025-11-26 04:03:02 +00:00
Patrick Walton	89f9dcb431	Don't require cameras to have color render targets. (#20830 ) It can occasionally be useful to have cameras that only render prepasses such as depth. Other game engines such as Unity support this feature by allowing a depth-only render target to be assigned to a camera. Bevy, however, has no easy mechanism for this. (Creating a `ShadowView` in the render app doesn't work, because various places in rendering assume that shadow views are associated with lights.) This patch fixes the problem by introducing a new type of `RenderTarget`, `RenderTarget::None`. Cameras with no render target will skip the main opaque and transparent render passes, but any prepasses on such cameras will still occur. Adding a `DepthPrepass` to such a camera enables depth-only cameras, with maximum efficiency as the fragment shader won't exist and no color buffer will be bound. Note that, when no render target is specified, the physical size of the viewport must be explicitly specified, as Bevy has no other mechanism to determine it. A new example, `render_depth_to_texture`, has been added, containing a rotating cube and a depth-only camera orbiting it. The depth texture that the camera produces is rendered onto a plane using a custom shader. (NB: In such scenarios, the depth texture must be copied from the camera to a custom image due to (a) the `wgpu` limitation that a depth texture can't be both a render target and bindable as a texture and (b) the fact that Bevy depth textures are managed by Bevy itself and exposed only to the render world. The example uses a custom render node to perform the copy.) The depth-only camera can be moved using the WASD keys. <img width="2564" height="1500" alt="Screenshot 2025-09-02 080508" src="https://github.com/user-attachments/assets/415e7f4d-393d-4be3-b569-829c06901078" />	2025-09-03 03:18:39 +00:00
charlotte 🌸	b6922f98d1	Revert bevy_sprite_render rename in shaders (#20644 ) Fixes #20643	2025-08-18 23:11:28 +00:00
Rob Parrett	3560b112f4	Fix imports in some 2d examples with custom shaders (#20639 ) # Objective Fixes #20615 ## Solution These shaders weren't updated when the import moved in #20587. Fix the imports. ## Testing ``` cargo run --example custom_gltf_vertex_attribute cargo run --example shader_material_2d cargo run --example mesh2d_manual ``` Co-authored-by: François Mockers <francois.mockers@vleue.com>	2025-08-18 21:59:54 +00:00
dontgetfoundout	5bc5a1325a	Update Game of Life compute example to include a uniform buffer variable (#20466 ) # Objective It is currently a little unclear how to use uniform buffers in compute shaders. The other examples of uniform buffers in the Bevy examples and codebase either are built on Materials or use `DynamicUniformBuffer`s created from a `ViewNode`. Neither of these are a great fit for use in a compute shader. ## Solution Update the compute shader example to pass a uniform buffer to the shader that determines the color for alive cells. ## Discussion Topics - Is this the right way to pass this data to the shader? - Should we be encouraging use of uniform buffers in compute shaders at all? Some in the community prefer the ergonomics of storage buffers in most (all?) compute shader cases. Do we want to push users to use storage buffers instead? - I took the idea to use color as the input from IceSentry on Discord, but this did require me to change the texture format to support non-red colors. Does this undermine the goals of the shader example? Is this the wrong texture format? ## Testing - Did you test these changes? If so, how? - The changes were manually validated with a number of different `LinearRgba` values for `alive_color` - Are there any parts that need more testing? - How can other people (reviewers) test your changes? Is there anything specific they need to know? - ` cargo run --example compute_shader_game_of_life` - Color can be set using `alive_color` property on `GameOfLifeUniforms` - If relevant, what platforms did you test these changes on, and are there any important ones you can't test? 
- Manually validated on Windows and WASM (WebGPU) targets - WASM WebGL2 doesn't appear to support textures in compute shaders --- ## Showcase <img width="1602" height="939" alt="image" src="https://github.com/user-attachments/assets/9a535617-a179-4f20-b686-596899f11d18" /> --------- Co-authored-by: dontgetfoundout <inflatedego@gmail.com>	2025-08-11 22:52:02 +00:00
charlotte 🌸	e6ec2c181d	Material bind group shader def (#20069 ) Use a shader def for the material bind group index to make it easier for when we want to switch back to group 2 in the future without breaking everyone again. --------- Co-authored-by: Alice Cecile <alice.i.cecile@gmail.com> Co-authored-by: atlv <email@atlasdostal.com> Co-authored-by: atlas dostal <rodol@rivalrebels.com>	2025-08-06 05:09:12 +00:00
Gilles Henaux	ca25a67d0d	Fix the extended_material example on WebGL2 (#18812 ) # Objective - Fixes #13872 (also mentioned in #17167) ## Solution - Added conditional padding fields to the shader uniform ## Alternatives ### 1- Use a UVec4 Replace the `u32` field in `MyExtension` by a `UVec4` and only use the `x` coordinate. (This was the original approach, but for consistency with the rest of the codebase, separate padding fields seem to be preferred) ### 2- Don't fix it, unlist it While the fix is quite simple, it does muddy the waters a tiny bit due to `quantize_steps` now being a UVec4 instead of a simple u32. We could simply remove this example from the examples that support WebGL2. ## Testing - Ran the example locally on WebGL2 (and native Vulkan) successfully	2025-07-07 19:34:12 +00:00
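The padding fix in ca25a67d0d can be illustrated with a plain `#[repr(C)]` struct. The `quantize_steps` field comes from the commit message. The struct name, the padding field name, and the unconditional padding are assumptions of this sketch. The PR itself adds the padding conditionally, and the real type goes through Bevy's `AsBindGroup` machinery rather than a hand-written layout.

```rust
/// Hypothetical CPU-side mirror of the extension's uniform data.
/// A lone `u32` is 4 bytes; three padding words round it up to 16 bytes,
/// the uniform buffer alignment that WebGL 2 requires.
#[repr(C)]
#[derive(Clone, Copy, Debug, Default, PartialEq, Eq)]
pub struct MyExtensionUniform {
    pub quantize_steps: u32,
    /// Unused; present only to pad the struct to 16 bytes.
    pub _padding: [u32; 3],
}

/// Size in bytes of the padded uniform.
pub const UNIFORM_SIZE: usize = std::mem::size_of::<MyExtensionUniform>();
```

The alternative the commit message lists, a `UVec4` that only uses its `x` component, has the same 16-byte footprint but hides the padding inside the field type.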
Nicky Fahey	831073105f	Add comment to custom vertex attribute example to make it easier to convert to 2D (#18603 ) # Objective - It's not clear what changes are needed to the shader to convert the example to 2D. - If you leave the shader unchanged you get a very confusing error (see linked issue). - Fixes #14077 ## Solution A separate example probably isn't needed as there is little difference between 3D and 2D, but a note saying what changes are needed to the shader would make it a lot easier. Let me know if you think it is also worth adding some notes to the rust file, but it is mostly trivial changes such as changing `Mesh3d` to `Mesh2d`. I have left the original code in comments next to the changes in the gist linked at the bottom if you wish to compare. ## Testing - I just spent a long time working it out the hard way. This would have made it a lot quicker. - I have tested the 2D version of the shader with the changes explained in the suggested comment and it works as expected. - For testing purposes [here is a complete working 2D example](https://gist.github.com/nickyfahey/647e2a2c45e695f24e288432b811dfc2). (note that as per the original example the shader file needs to go in 'assets/shaders/')	2025-07-07 19:26:37 +00:00
charlotte 🌸	e6ba9a6d18	Type erased materials (#19667 ) # Objective Closes #18075 In order to enable a number of patterns for dynamic materials in the engine, it's necessary to decouple the renderer from the `Material` trait. This opens the possibility for: - Materials that aren't coupled to `AsBindGroup`. - 2d using the underlying 3d bindless infrastructure. - Dynamic materials that can change their layout at runtime. - Materials that aren't even backed by a Rust struct at all. ## Solution In short, remove all trait bounds from render world material systems and resources. This means moving a bunch of stuff onto `MaterialProperties` and engaging in some hacks to make specialization work. Rather than storing the bind group data in `MaterialBindGroupAllocator`, right now we're storing it in a closure on `MaterialProperties`. TBD if this has bad performance characteristics. ## Benchmarks - `many_cubes`: `cargo run --example many_cubes --release --features=bevy/trace_tracy -- --vary-material-data-per-instance`: ![Screenshot 2025-06-26 235426](https://github.com/user-attachments/assets/10a0ee29-9932-4f91-ab43-33518b117ac5) - @DGriffin91's Caldera `cargo run --release --features=bevy/trace_tracy -- --random-materials` ![image](https://github.com/user-attachments/assets/ef91ba6a-8e88-4922-a73f-acb0af5b0dbc) - @DGriffin91's Caldera with 20 unique material types (i.e. `MaterialPlugin<M>`) and random materials per mesh `cargo run --release --features=bevy/trace_tracy -- --random-materials` ![Screenshot 2025-06-27 000425](https://github.com/user-attachments/assets/9561388b-881d-46cf-8c3d-b15b3e9aedc7) ### TODO - We almost certainly lost some parallelization from removing the type params that could be gained back from smarter iteration. - Test all the things that could have broken. 
- ~Fix meshlets~ ## Showcase See [the example](https://github.com/bevyengine/bevy/pull/19667/files#diff-9d768cfe1c3aa81eff365d250d3cbe5a63e8df63e81dd85f64c3c3cd993f6d94) for a custom material implemented without the use of the `Material` trait and thus `AsBindGroup`. ![image](https://github.com/user-attachments/assets/e3fcca7c-e04e-4a4e-9d89-39d697a9e3b8) --------- Co-authored-by: IceSentry <IceSentry@users.noreply.github.com> Co-authored-by: IceSentry <c.giguere42@gmail.com>	2025-06-27 22:57:24 +00:00
charlotte 🌸	96dcbc5f8c	Upgrade to `wgpu` version `25.0` (#19563 ) # Objective Upgrade to `wgpu` version `25.0`. Depends on https://github.com/bevyengine/naga_oil/pull/121 ## Solution ### Problem The biggest issue we face upgrading is the following requirement: > To facilitate this change, there was an additional validation rule put in place: if there is a binding array in a bind group, you may not use dynamic offset buffers or uniform buffers in that bind group. This requirement comes from vulkan rules on UpdateAfterBind descriptors. This is a major difficulty for us, as there are a number of binding arrays that are used in the view bind group. Note, this requirement does not affect merely uniform buffers that use dynamic offset but the use of any uniform in a bind group that also has a binding array. ### Attempted fixes The easiest fix would be to change uniforms to be storage buffers whenever binding arrays are in use: ```wgsl #ifdef BINDING_ARRAYS_ARE_USED @group(0) @binding(0) var<uniform> view: View; @group(0) @binding(1) var<uniform> lights: types::Lights; #else @group(0) @binding(0) var<storage> view: array<View>; @group(0) @binding(1) var<storage> lights: array<types::Lights>; #endif ``` This requires passing the view index to the shader so that we know where to index into the buffer: ```wgsl struct PushConstants { view_index: u32, } var<push_constant> push_constants: PushConstants; ``` Using push constants is no problem because binding arrays are only usable on native anyway. However, this greatly complicates the ability to access `view` in shaders. For example: ```wgsl #ifdef BINDING_ARRAYS_ARE_USED mesh_view_bindings::view.view_from_world[0].z #else mesh_view_bindings::view[mesh_view_bindings::view_index].view_from_world[0].z #endif ``` Using this approach would work but would have the effect of polluting our shaders with ifdef spam basically everywhere. Why not use a function? 
Unfortunately, the following is not valid wgsl as it returns a binding directly from a function in the uniform path. ```wgsl fn get_view() -> View { #if BINDING_ARRAYS_ARE_USED let view_index = push_constants.view_index; let view = views[view_index]; #endif return view; } ``` This also poses problems for things like lights where we want to return a ptr to the light data. Returning ptrs from wgsl functions isn't allowed even if both bindings were buffers. The next attempt was to simply use indexed buffers everywhere, in both the binding array and non binding array path. This would be viable if push constants were available everywhere to pass the view index, but unfortunately they are not available on webgpu. This means either passing the view index in a storage buffer (not ideal for such a small amount of state) or using push constants sometimes and uniform buffers only on webgpu. However, this kind of conditional layout infects absolutely everything. Even if we were to accept just using storage buffer for the view index, there's also the additional problem that some dynamic offsets aren't actually per-view but per-use of a setting on a camera, which would require passing that uniform data on every camera regardless of whether that rendering feature is being used, which is also gross. As such, although it's gross, the simplest solution is just to bump binding arrays into `@group(1)` and all other bindings up one bind group. This should still bring us under the device limit of 4 for most users. ### Next steps / looking towards the future I'd like to avoid needing to split our view bind group into multiple parts. In the future, if `wgpu` were to add `@builtin(draw_index)`, we could build a list of draw state in gpu processing and avoid the need for any kind of state change at all (see https://github.com/gfx-rs/wgpu/issues/6823). This would also provide significantly more flexibility to handle things like offsets into other arrays that may not be per-view. 
### Testing Tested a number of examples, there are probably more that are still broken. --------- Co-authored-by: François Mockers <mockersf@gmail.com> Co-authored-by: Elabajaba <Elabajaba@users.noreply.github.com>	2025-06-26 19:41:47 +00:00
Mathis Brossier	119eb51f00	Fix game_of_life shader relying on Naga bug (#18951 ) # Objective The game of life example shader relies on a Naga bug ([6397](https://github.com/gfx-rs/wgpu/issues/6397) / [4536](https://github.com/gfx-rs/wgpu/issues/4536)). In WGSL certain arithmetic operations must be explicitly parenthesized ([reference](https://www.w3.org/TR/WGSL/#operator-precedence-associativity)). Naga doesn't enforce that (and also the precedence order is [messed up](https://github.com/gfx-rs/wgpu/issues/4536#issuecomment-1780113990)). So this example may break soon. This is the only sample shader having this issue. ## Solution added parentheses ## Testing ran the example before and after the fix with `cargo run --example compute_shader_game_of_life`	2025-04-26 21:38:08 +00:00
Patrick Walton	dc7c8f228f	Add bindless support back to `ExtendedMaterial`. (#18025 ) PR #17898 disabled bindless support for `ExtendedMaterial`. This commit adds it back. It also adds a new example, `extended_material_bindless`, showing how to use it.	2025-04-09 15:34:44 +00:00
François Mockers	3945a6de3b	Fix wesl in wasm and webgl2 (#18591 ) # Objective - feature `shader_format_wesl` doesn't compile in Wasm - once fixed, example `shader_material_wesl` doesn't work in WebGL2 ## Solution - remove special path handling when loading shaders. this seems like a way to escape the asset folder which we don't want to allow, and can't compile on android or wasm, and can't work on iOS (filesystem is rooted there) - pad material so that it's 16 bits. I couldn't get conditional compilation to work in wesl for type declaration, it fails to parse - the shader renders the color `(0.0, 0.0, 0.0, 0.0)` when it's not a polka dot. this renders as black on WebGPU/metal/..., and white on WebGL2. change it to `(0.0, 0.0, 0.0, 1.0)` so that it's black everywhere	2025-03-28 21:45:02 +00:00
charlotte	35bf9753e8	Fixes for WESL on Windows (#18373 ) # Objective WESL was broken on windows. ## Solution - Upgrade to `wesl_rs` 1.2. - Fix path handling on windows. - Improve example for khronos demo this week.	2025-03-17 22:29:29 +00:00
Benjamin Brienen	c3ff6d4136	Fix non-crate typos (#18219 ) # Objective Correct spelling ## Solution Fix typos, specifically ones that I found in folders other than /crates ## Testing CI --------- Co-authored-by: Alice Cecile <alice.i.cecile@gmail.com>	2025-03-11 06:17:48 +00:00
Patrick Walton	913eb46324	Reimplement bindless storage buffers. (#17994 ) Support for bindless storage buffers was temporarily removed with the bindless revamp. This commit restores that support.	2025-03-10 21:32:19 +00:00
charlotte	181445c56b	Add support for experimental WESL shader source (#17953 ) # Objective WESL's pre-MVP `0.1.0` has been [released](https://docs.rs/wesl/latest/wesl/)! Add support for WESL shader source so that we can begin playing and testing WESL, as well as aiding in their development. ## Solution Adds a `ShaderSource::WESL` that can be used to load `.wesl` shaders. Right now, we don't support mixing `naga-oil`. Additionally, WESL shaders currently need to pass through the naga frontend, which the WESL team is aware isn't great for performance (they're working on compiling into naga modules). Also, since our shaders are managed using the asset system, we don't currently support using file based imports like `super` or package scoped imports. Further work will be needed to assess how we want to support this. --- ## Showcase See the `shader_material_wesl` example. Be sure to press space to activate party mode (trigger conditional compilation)! https://github.com/user-attachments/assets/ec6ad19f-b6e4-4e9d-a00f-6f09336b08a4	2025-03-09 19:26:55 +00:00
Patrick Walton	28441337bb	Use global binding arrays for bindless resources. (#17898 ) Currently, Bevy's implementation of bindless resources is rather unusual: every binding in an object that implements `AsBindGroup` (most commonly, a material) becomes its own separate binding array in the shader. This is inefficient for two reasons: 1. If multiple materials reference the same texture or other resource, the reference to that resource will be duplicated many times. This increases `wgpu` validation overhead. 2. It creates many unused binding array slots. This increases `wgpu` and driver overhead and makes it easier to hit limits on APIs that `wgpu` currently imposes tight resource limits on, like Metal. This PR fixes these issues by switching Bevy to use the standard approach in GPU-driven renderers, in which resources are de-duplicated and passed as global arrays, one for each type of resource. Along the way, this patch introduces per-platform resource limits and bumps them from 16 resources per binding array to 64 resources per bind group on Metal and 2048 resources per bind group on other platforms. (Note that the number of resources per binding array isn't the same as the number of resources per bind group; as it currently stands, if all the PBR features are turned on, Bevy could pack as many as 496 resources into a single slab.) The limits have been increased because `wgpu` now has universal support for partially-bound binding arrays, which mean that we no longer need to fill the binding arrays with fallback resources on Direct3D 12. The `#[bindless(LIMIT)]` declaration when deriving `AsBindGroup` can now simply be written `#[bindless]` in order to have Bevy choose a default limit size for the current platform. Custom limits are still available with the new `#[bindless(limit(LIMIT))]` syntax: e.g. `#[bindless(limit(8))]`. The material bind group allocator has been completely rewritten. 
Now there are two allocators: one for bindless materials and one for non-bindless materials. The new non-bindless material allocator simply maintains a 1:1 mapping from material to bind group. The new bindless material allocator maintains a list of slabs and allocates materials into slabs on a first-fit basis. This unfortunately makes its performance O(number of resources per object * number of slabs), but the number of slabs is likely to be low, and it's planned to become even lower in the future with `wgpu` improvements. Resources are de-duplicated within a slab and reference counted. So, for instance, if multiple materials refer to the same texture, that texture will exist only once in the appropriate binding array. To support these new features, this patch adds the concept of a bindless descriptor to the `AsBindGroup` trait. The bindless descriptor allows the material bind group allocator to probe the layout of the material, now that an array of `BindGroupLayoutEntry` records is insufficient to describe the group. The `#[derive(AsBindGroup)]` has been heavily modified to support the new features. The most important user-facing change to that macro is that the struct-level `uniform` attribute, `#[uniform(BINDING_NUMBER, StandardMaterial)]`, now reads `#[uniform(BINDLESS_INDEX, MATERIAL_UNIFORM_TYPE, binding_array(BINDING_NUMBER))]`, allowing the material to specify the binding number for the binding array that holds the uniform data. To make this patch simpler, I removed support for bindless `ExtendedMaterial`s, as well as field-level bindless uniform and storage buffers. I intend to add back support for these as a follow-up. Because they aren't in any released Bevy version yet, I figured this was OK. Finally, this patch updates `StandardMaterial` for the new bindless changes. Generally, code throughout the PBR shaders that looked like `base_color_texture[slot]` now looks like `bindless_2d_textures[material_indices[slot].base_color_texture]`. 
This patch fixes a system hang that I experienced on the [Caldera test] when running with `caldera --random-materials --texture-count 100`. The time per frame is around 19.75 ms, down from 154.2 ms in Bevy 0.14: a 7.8× speedup. [Caldera test]: https://github.com/DGriffin91/bevy_caldera_scene	2025-02-21 05:55:36 +00:00
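The first-fit slab allocation with de-duplication and reference counting described in the commit above can be sketched roughly as follows. This is a minimal illustration, not Bevy's actual allocator: `ResourceId`, `Slab`, `BindlessAllocator`, and all method names are hypothetical, and a resource is reduced to a plain integer id.

```rust
use std::collections::{HashMap, HashSet};

/// Hypothetical resource identifier (stands in for a texture, sampler, or buffer).
type ResourceId = u32;

/// One slab: a fixed number of binding-array slots holding
/// de-duplicated, reference-counted resources.
struct Slab {
    /// Maps each resident resource to (slot index, reference count).
    entries: HashMap<ResourceId, (usize, usize)>,
    /// Slots that are currently unused; the lowest index is popped first.
    free_slots: Vec<usize>,
}

impl Slab {
    fn new(capacity: usize) -> Self {
        Slab {
            entries: HashMap::new(),
            free_slots: (0..capacity).rev().collect(),
        }
    }

    /// Tries to place all `resources` in this slab. Resources that are
    /// already resident only get their reference count bumped.
    fn try_insert(&mut self, resources: &[ResourceId]) -> Option<Vec<usize>> {
        let missing: HashSet<ResourceId> = resources
            .iter()
            .copied()
            .filter(|r| !self.entries.contains_key(r))
            .collect();
        if missing.len() > self.free_slots.len() {
            return None; // Does not fit; the caller moves on to the next slab.
        }
        let mut slots = Vec::with_capacity(resources.len());
        for &r in resources {
            if let Some((slot, count)) = self.entries.get_mut(&r) {
                *count += 1;
                slots.push(*slot);
            } else {
                let slot = self.free_slots.pop().expect("capacity checked above");
                self.entries.insert(r, (slot, 1));
                slots.push(slot);
            }
        }
        Some(slots)
    }

    /// Drops one reference; the slot is recycled once the count reaches zero.
    fn release(&mut self, resource: ResourceId) {
        if let Some((slot, count)) = self.entries.get_mut(&resource) {
            *count -= 1;
            if *count == 0 {
                let slot = *slot;
                self.entries.remove(&resource);
                self.free_slots.push(slot);
            }
        }
    }
}

/// Hypothetical allocator: a list of slabs searched in order (first fit).
struct BindlessAllocator {
    slab_capacity: usize,
    slabs: Vec<Slab>,
}

impl BindlessAllocator {
    fn new(slab_capacity: usize) -> Self {
        BindlessAllocator { slab_capacity, slabs: Vec::new() }
    }

    /// Returns (slab index, slot index per resource).
    fn allocate(&mut self, resources: &[ResourceId]) -> (usize, Vec<usize>) {
        for (index, slab) in self.slabs.iter_mut().enumerate() {
            if let Some(slots) = slab.try_insert(resources) {
                return (index, slots);
            }
        }
        let mut slab = Slab::new(self.slab_capacity);
        let slots = slab
            .try_insert(resources)
            .expect("a material must fit into an empty slab");
        self.slabs.push(slab);
        (self.slabs.len() - 1, slots)
    }
}
```

The linear scan over slabs is what gives the O(number of resources per object * number of slabs) cost mentioned in the commit message.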
ickshonpe	02985c3d56	`ui_material` example webgl2 fix (#17852 ) # Objective Fixes #17851 ## Solution Align the `slider` uniform to 16 bytes by making it a `vec4`. ## Testing Run the example using: ``` cargo run -p build-wasm-example -- --api webgl2 ui_material basic-http-server examples/wasm/ ```	2025-02-13 20:52:26 +00:00
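As a rough CPU-side illustration of the 16-byte alignment fix above: a lone `f32` uniform occupies 4 bytes, while widening it to four floats (the equivalent of a WGSL `vec4<f32>`) brings the struct to 16 bytes. The struct names below are hypothetical and are not taken from the example's source.

```rust
/// Hypothetical uniform holding only the slider value: 4 bytes,
/// which is not a multiple of 16.
#[repr(C)]
struct SliderUniformUnpadded {
    value: f32,
}

/// Hypothetical padded variant: the value lives in the first lane of a
/// four-float array, mirroring a `vec4<f32>` on the shader side.
#[repr(C)]
struct SliderUniformPadded {
    value: [f32; 4],
}

/// Returns true if a type's size is a multiple of 16 bytes.
fn is_16_byte_multiple<T>() -> bool {
    std::mem::size_of::<T>() % 16 == 0
}
```

The shader would then read the slider from the first component of the vector.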
charlotte	a861452d68	Add user supplied mesh tag (#17648 ) # Objective Because of mesh preprocessing, users cannot rely on `@builtin(instance_index)` in order to reference external data, as the instance index is not stable, either from frame to frame or relative to the total spawn order of mesh instances. ## Solution Add a user supplied mesh index that can be used for referencing external data when drawing instanced meshes. Closes #13373 ## Testing Benchmarked `many_cubes` showing no difference in total frame time. ## Showcase https://github.com/user-attachments/assets/80620147-aafc-4d9d-a8ee-e2149f7c8f3b --------- Co-authored-by: IceSentry <IceSentry@users.noreply.github.com>	2025-02-10 22:38:13 +00:00
IceSentry	4ecbe001d5	Add a custom render phase example (#16916 ) # Objective - It's currently very hard for beginners and advanced users to get a full understanding of a complete render phase. ## Solution - Implement a full custom render phase - The render phase in the example is intended to show a custom stencil phase that renders the stencil in red directly on the screen --- ## Showcase <img width="1277" alt="image" src="https://github.com/user-attachments/assets/e9dc0105-4fb6-463f-ad53-0529b575fd28" /> ## Notes More docs to explain what is going on is still needed but the example works and can already help some people. We might want to consider using a batched phase and cold specialization in the future, but the example is already complex enough as it is. --------- Co-authored-by: Christopher Biscardi <chris@christopherbiscardi.com>	2025-02-10 21:17:37 +00:00
ickshonpe	c0ccc87738	UI material border radius (#15171 ) # Objective I wrote a box shadow UI material naively thinking I could use the border widths attribute to hold the border radius but it doesn't work as the border widths are automatically set in the extraction function. Need to send border radius to the shader separately for it to be viable. ## Solution Add a `border_radius` vertex attribute to the ui material. This PR also removes the normalization of border widths for custom UI materials. The regular UI shader doesn't do this so it's a bit confusing and means you can't use the logic from `ui.wgsl` in your custom UI materials. ## Testing / Showcase Made a change to the `ui_material` example to display border radius: ```cargo run --example ui_material``` <img width="569" alt="corners" src="https://github.com/user-attachments/assets/36412736-a9ee-4042-aadd-68b9cafb17cb" />	2025-01-28 04:54:48 +00:00
Patrick Walton	fc831c390d	Implement basic clustered decal projectors. (#17315 ) This commit adds support for decal projectors to Bevy, allowing for textures to be projected on top of geometry. Decal projectors are clusterable objects, just as punctual lights and light probes are. This means that decals are only evaluated for objects within the conservative bounds of the projector, and they don't require a second pass. These clustered decals require support for bindless textures and as such currently don't work on WebGL 2, WebGPU, macOS, or iOS. For an alternative that doesn't require bindless, see PR #16600. I believe that both contact projective decals in #16600 and clustered decals are desirable to have in Bevy. Contact projective decals offer broader hardware and driver support, while clustered decals don't require the creation of bounding geometry. A new example, `decal_projectors`, has been added, which demonstrates multiple decals on a rotating object. The decal projectors can be scaled and rotated with the mouse. There are several limitations of this initial patch that can be addressed in follow-ups: 1. There's no way to specify the Z-index of decals. That is, the order in which multiple decals are blended on top of one another is arbitrary. A follow-up could introduce some sort of Z-index field so that artists can specify that some decals should be blended on top of others. 2. Decals don't take the normal of the surface they're projected onto into account. Most decal implementations in other engines have a feature whereby the angle between the decal projector and the normal of the surface must be within some threshold for the decal to appear. Often, artists can specify a fade-off range for a smooth transition between oblique surfaces and aligned surfaces. 3. There's no distance-based fadeoff toward the end of the projector range. Many decal implementations have this. This addresses #2401. 
## Showcase ![Screenshot 2025-01-11 052913](https://github.com/user-attachments/assets/8fabbafc-60fb-461d-b715-d7977e10fe1f)	2025-01-26 20:13:39 +00:00
ickshonpe	51c3bf24b7	`custom_ui_material` border fix (#17282 ) # Objective The order of the border edges in `UiVertexOutput` is left, right, top, bottom but in `custom_ui_material` the selectors switch them so left is right and top is bottom. ## Solution Reverse the conditions so that the correct border values are selected.	2025-01-11 05:45:20 +00:00
Patrick Walton	a8f15bd95e	Introduce two-level bins for multidrawable meshes. (#16898 ) Currently, our batchable binned items are stored in a hash table that maps bin key, which includes the batch set key, to a list of entities. Multidraw is handled by sorting the bin keys and accumulating adjacent bins that can be multidrawn together (i.e. have the same batch set key) into multidraw commands during `batch_and_prepare_binned_render_phase`. This is reasonably efficient right now, but it will complicate future work to retain indirect draw parameters from frame to frame. Consider what must happen when we have retained indirect draw parameters and the application adds a bin (i.e. a new mesh) that shares a batch set key with some pre-existing meshes. (That is, the new mesh can be multidrawn with the pre-existing meshes.) To be maximally efficient, our goal in that scenario will be to update only the indirect draw parameters for the batch set (i.e. multidraw command) containing the mesh that was added, while leaving the others alone. That means that we have to quickly locate all the bins that belong to the batch set being modified. In the existing code, we would have to sort the list of bin keys so that bins that can be multidrawn together become adjacent to one another in the list. Then we would have to do a binary search through the sorted list to find the location of the bin that was just added. Next, we would have to widen our search to adjacent indexes that contain the same batch set, doing expensive comparisons against the batch set key every time. Finally, we would reallocate the indirect draw parameters and update the stored pointers to the indirect draw parameters that the bins store. By contrast, it'd be dramatically simpler if we simply changed the way bins are stored to first map from batch set key (i.e. multidraw command) to the bins (i.e. meshes) within that batch set key, and then from each individual bin to the mesh instances. 
That way, the scenario above in which we add a new mesh will be simpler to handle. First, we will look up the batch set key corresponding to that mesh in the outer map to find an inner map corresponding to the single multidraw command that will draw that batch set. We will know how many meshes the multidraw command is going to draw by the size of that inner map. Then we simply need to reallocate the indirect draw parameters and update the pointers to those parameters within the bins as necessary. There will be no need to do any binary search or expensive batch set key comparison: only a single hash lookup and an iteration over the inner map to update the pointers. This patch implements the above technique. Because we don't have retained bins yet, this PR provides no performance benefits. However, it opens the door to maximally efficient updates when only a small number of meshes change from frame to frame. The main churn that this patch causes is that the batch set key (which uniquely specifies a multidraw command) and bin key (which uniquely specifies a mesh within that multidraw command) are now separate, instead of the batch set key being embedded within the bin key. In order to isolate potential regressions, I think that at least #16890, #16836, and #16825 should land before this PR does. ## Migration Guide * The batch set key is now separate from the bin key in `BinnedPhaseItem`. The batch set key is used to collect multidrawable meshes together. If you aren't using the multidraw feature, you can safely set the batch set key to `()`.	2025-01-06 18:34:40 +00:00
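The two-level layout described in the commit above (batch set key to bins, then bin to mesh instances) can be sketched with nested hash maps. This is only an illustration of the data structure's shape: `TwoLevelBins`, the integer key types, and the method names are hypothetical, not Bevy's `BinnedRenderPhase` API.

```rust
use std::collections::HashMap;

/// Hypothetical key types; the real keys carry pipeline and asset data.
type BatchSetKey = u32; // identifies one multidraw command
type BinKey = u32; // identifies one mesh within that command
type Entity = u64; // identifies one mesh instance

/// Outer map: batch set key -> inner map of bins.
/// Inner map: bin key -> the instances in that bin.
#[derive(Default)]
struct TwoLevelBins {
    batch_sets: HashMap<BatchSetKey, HashMap<BinKey, Vec<Entity>>>,
}

impl TwoLevelBins {
    /// Adds an instance with hash lookups only; no sorting or binary search.
    fn add(&mut self, batch_set: BatchSetKey, bin: BinKey, entity: Entity) {
        self.batch_sets
            .entry(batch_set)
            .or_default()
            .entry(bin)
            .or_default()
            .push(entity);
    }

    /// Number of meshes the multidraw command for `batch_set` would draw,
    /// i.e. the size of the inner map.
    fn mesh_count(&self, batch_set: BatchSetKey) -> usize {
        self.batch_sets.get(&batch_set).map_or(0, |bins| bins.len())
    }
}
```

Adding a new mesh to an existing batch set touches only that batch set's inner map, which is the property the commit relies on for future retained indirect draw parameters.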
Rob Parrett	651b22f31f	Update `typos` (#17126 ) # Objective Use the latest version of `typos` and fix the typos that it now detects # Additional Info By the way, `typos` has a "low priority typo suggestions issue" where we can throw typos we find that `typos` doesn't catch. (This link may go stale) https://github.com/crate-ci/typos/issues/1200	2025-01-03 17:44:26 +00:00
kurk070ff	3cd649b805	Fix inaccurate comment in custom_ui_material.wgsl shader (#16846 ) # Objective - Modify a comment in the shader file to describe what the shader actually does - Fixes #16830 ## Solution - Changed the comment. ## Testing - Testing is not relevant to fixing comments (as long as the comment is accurate) --------- Co-authored-by: Freya Pines <freya@MacBookAir.lan> Co-authored-by: Freya Pines <freya@Freyas-MacBook-Air.local>	2024-12-17 00:09:36 +00:00
Patrick Walton	35826be6f7	Implement bindless lightmaps. (#16653 ) This commit allows Bevy to bind 16 lightmaps at a time, if the current platform supports bindless textures. Naturally, if bindless textures aren't supported, Bevy falls back to binding only a single lightmap at a time. As lightmaps are usually heavily atlased, I doubt many scenes will use more than 16 lightmap textures. This has little performance impact now, but it's desirable for us to reap the benefits of multidraw and bindless textures on scenes that use lightmaps. Otherwise, we might have to break batches in order to switch those lightmaps. Additionally, this PR slightly reduces the cost of binning because it makes the lightmap index in `Opaque3dBinKey` 32 bits instead of an `AssetId`. ## Migration Guide * The `Opaque3dBinKey::lightmap_image` field is now `Opaque3dBinKey::lightmap_slab`, which is a lightweight identifier for an entire binding array of lightmaps.	2024-12-16 23:37:06 +00:00
Patrick Walton	b7bcd313ca	Cluster light probes using conservative spherical bounds. (#13746 ) This commit allows the Bevy renderer to use the clustering infrastructure for light probes (reflection probes and irradiance volumes) on platforms where at least 3 storage buffers are available. On such platforms (the vast majority), we stop performing brute-force searches of light probes for each fragment and instead only search the light probes with bounding spheres that intersect the current cluster. This should dramatically improve scalability of irradiance volumes and reflection probes. The primary platform that doesn't support 3 storage buffers is WebGL 2, and we continue using a brute-force search of light probes on that platform, as the UBO that stores per-cluster indices is too small to fit the light probe counts. Note, however, that that platform also doesn't support bindless textures (indeed, it would be very odd for a platform to support bindless textures but not SSBOs), so we only support one of each type of light probe per drawcall there in the first place. Consequently, this isn't a performance problem, as the search will only have one light probe to consider. (In fact, clustering would probably end up being a performance loss.) Known potential improvements include: 1. We currently cull based on a conservative bounding sphere test and not based on the oriented bounding box (OBB) of the light probe. This is improvable, but in the interests of simplicity, I opted to keep the bounding sphere test for now. The OBB improvement can be a follow-up. 2. This patch doesn't change the fact that each fragment only takes a single light probe into account. Typical light probe implementations detect the case in which multiple light probes cover the current fragment and perform some sort of weighted blend between them. 
As the light probe fetch function presently returns only a single light probe, implementing that feature would require more code restructuring, so I left it out for now. It can be added as a follow-up. 3. Light probe implementations typically have a falloff range. Although this is a wanted feature in Bevy, this particular commit also doesn't implement that feature, as it's out of scope. 4. This commit doesn't raise the maximum number of light probes past its current value of 8 for each type. This should be addressed later, but would possibly require more bindings on platforms with storage buffers, which would increase this patch's complexity. Even without raising the limit, this patch should constitute a significant performance improvement for scenes that get anywhere close to this limit. In the interest of keeping this patch small, I opted to leave raising the limit to a follow-up. ## Changelog ### Changed * Light probes (reflection probes and irradiance volumes) are now clustered on most platforms, improving performance when many light probes are present. --------- Co-authored-by: Benjamin Brienen <Benjamin.Brienen@outlook.com> Co-authored-by: Alice Cecile <alice.i.cecile@gmail.com>	2024-12-05 13:07:10 +00:00
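A conservative bounding-sphere test of the kind the commit above describes might look like the sketch below. It assumes, purely for illustration, that each cluster is represented by an axis-aligned box; the `Aabb` type and both function names are hypothetical and do not reflect Bevy's clustering code.

```rust
/// Hypothetical axis-aligned bounds of one cluster.
#[derive(Clone, Copy)]
struct Aabb {
    min: [f32; 3],
    max: [f32; 3],
}

/// Returns true if the sphere touches or overlaps the box. The closest
/// point on the box to the sphere center is found by clamping per axis.
fn sphere_intersects_aabb(center: [f32; 3], radius: f32, aabb: Aabb) -> bool {
    let mut distance_squared = 0.0f32;
    for axis in 0..3 {
        let closest = center[axis].clamp(aabb.min[axis], aabb.max[axis]);
        let delta = center[axis] - closest;
        distance_squared += delta * delta;
    }
    distance_squared <= radius * radius
}

/// Indices of the probes whose bounding spheres reach the cluster;
/// only these would need to be searched per fragment.
fn probes_for_cluster(cluster: Aabb, probes: &[([f32; 3], f32)]) -> Vec<usize> {
    probes
        .iter()
        .enumerate()
        .filter(|(_, (center, radius))| sphere_intersects_aabb(*center, *radius, cluster))
        .map(|(index, _)| index)
        .collect()
}
```

The test is conservative in the sense that a sphere enclosing a probe's oriented box may be assigned to clusters the box itself never touches, which costs some extra work but never misses a probe.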
Patrick Walton	5adf831b42	Add a bindless mode to `AsBindGroup`. (#16368 ) This patch adds the infrastructure necessary for Bevy to support bindless resources, by adding a new `#[bindless]` attribute to `AsBindGroup`. Classically, only a single texture (or sampler, or buffer) can be attached to each shader binding. This means that switching materials requires breaking a batch and issuing a new drawcall, even if the mesh is otherwise identical. This adds significant overhead not only in the driver but also in `wgpu`, as switching bind groups increases the amount of validation work that `wgpu` must do. Bindless resources are the typical solution to this problem. Instead of switching bindings between each texture, the renderer instead supplies a large array of all textures in the scene up front, and the material contains an index into that array. This pattern is repeated for buffers and samplers as well. The renderer now no longer needs to switch binding descriptor sets while drawing the scene. Unfortunately, as things currently stand, this approach won't quite work for Bevy. Two aspects of `wgpu` conspire to make this ideal approach unacceptably slow: 1. In the DX12 backend, all binding arrays (bindless resources) must have a constant size declared in the shader, and all textures in an array must be bound to actual textures. Changing the size requires a recompile. 2. Changing even one texture incurs revalidation of all textures, a process that takes time that's linear in the total size of the binding array. This means that declaring a large array of textures big enough to encompass the entire scene is presently unacceptably slow. For example, if you declare 4096 textures, then `wgpu` will have to revalidate all 4096 textures if even a single one changes. This process can take multiple frames. To work around this problem, this PR groups bindless resources into small slabs and maintains a free list for each. 
The size of each slab for the bindless arrays associated with a material is specified via the `#[bindless(N)]` attribute. For instance, consider the following declaration: ```rust #[derive(AsBindGroup)] #[bindless(16)] struct MyMaterial { #[buffer(0)] color: Vec4, #[texture(1)] #[sampler(2)] diffuse: Handle<Image>, } ``` The `#[bindless(N)]` attribute specifies that, if bindless arrays are supported on the current platform, each resource becomes a binding array of N instances of that resource. So, for `MyMaterial` above, the `color` attribute is exposed to the shader as `binding_array<vec4<f32>, 16>`, the `diffuse` texture is exposed to the shader as `binding_array<texture_2d<f32>, 16>`, and the `diffuse` sampler is exposed to the shader as `binding_array<sampler, 16>`. Inside the material's vertex and fragment shaders, the applicable index is available via the `material_bind_group_slot` field of the `Mesh` structure. So, for instance, you can access the current color like so: ```wgsl // `uniform` binding arrays are a non-sequitur, so `uniform` is automatically promoted // to `storage` in bindless mode. @group(2) @binding(0) var<storage> material_color: binding_array<Color, 4>; ... @fragment fn fragment(in: VertexOutput) -> @location(0) vec4<f32> { let color = material_color[mesh[in.instance_index].material_bind_group_slot]; ... } ``` Note that portable shader code can't guarantee that the current platform supports bindless textures. Indeed, bindless mode is only available in Vulkan and DX12. The `BINDLESS` shader definition is available for your use to determine whether you're on a bindless platform or not. Thus a portable version of the shader above would look like: ```wgsl #ifdef BINDLESS @group(2) @binding(0) var<storage> material_color: binding_array<Color, 4>; #else // BINDLESS @group(2) @binding(0) var<uniform> material_color: Color; #endif // BINDLESS ... 
@fragment fn fragment(in: VertexOutput) -> @location(0) vec4<f32> { #ifdef BINDLESS let color = material_color[mesh[in.instance_index].material_bind_group_slot]; #else // BINDLESS let color = material_color; #endif // BINDLESS ... } ``` Importantly, this PR doesn't update `StandardMaterial` to be bindless. So, for example, `scene_viewer` will currently not run any faster. I intend to update `StandardMaterial` to use bindless mode in a follow-up patch. A new example, `shaders/shader_material_bindless`, has been added to demonstrate how to use this new feature. Here's a Tracy profile of `submit_graph_commands` of this patch and an additional patch (not submitted yet) that makes `StandardMaterial` use bindless. Red is those patches; yellow is `main`. The scene was Bistro Exterior with a hack that forces all textures to opaque. You can see a 1.47x mean speedup. ![Screenshot 2024-11-12 161713](https://github.com/user-attachments/assets/4334b362-42c8-4d64-9cfb-6835f019b95c) ## Migration Guide * `RenderAssets::prepare_asset` now takes an `AssetId` parameter. * Bin keys now have Bevy-specific material bind group indices instead of `wgpu` material bind group IDs, as part of the bindless change. Use the new `MaterialBindGroupAllocator` to map from bind group index to bind group ID.	2024-12-03 18:00:34 +00:00
Jake Swenson	16b39c2b36	examples(shaders/glsl): Update GLSL Shader Example Camera View uniform (#15865 ) # Objective The Custom Material GLSL shader example has an old version of the camera view uniform structure. This PR updates the example GLSL custom material shader to have the latest structure. ## Solution I was running into issues using the camera world position (it wasn't changing) and someone in discord pointed me to the source of truth. `crates/bevy_render/src/view/view.wgsl` After using this latest uniform structure in my project I'm now able to work with the camera position in my shader. ## Testing I tested this change by running the example with: ```bash cargo run --features shader_format_glsl --example shader_material_glsl ``` <img width="1392" alt="image" src="https://github.com/user-attachments/assets/39fc82ec-ff3b-4864-ad73-05f3a25db483"> --------- Co-authored-by: Carter Anderson <mcanders1@gmail.com>	2024-10-19 01:08:55 +00:00
charlotte	40c26f80aa	Gpu readback (#15419 ) # Objective Adds a new `Readback` component to request readback of a `Handle<Image>` or `Handle<ShaderStorageBuffer>` to the CPU in a future frame. ## Solution We track the `Readback` component and allocate a target buffer to write the gpu resource into and map it back asynchronously, which then fires a trigger on the entity in the main world. This process is asynchronous, and generally takes a few frames. ## Showcase ```rust let mut buffer = ShaderStorageBuffer::from(vec![0u32; 16]); buffer.buffer_description.usage |= BufferUsages::COPY_SRC; let buffer = buffers.add(buffer); commands .spawn(Readback::buffer(buffer.clone())) .observe(|trigger: Trigger<ReadbackComplete>| { info!("Buffer data from previous frame {:?}", trigger.event()); }); ``` --------- Co-authored-by: Kristoffer Søholm <k.soeholm@gmail.com> Co-authored-by: IceSentry <IceSentry@users.noreply.github.com>	2024-09-30 17:28:55 +00:00
ickshonpe	09d2292016	Add a border to the UI material example (#15120 ) # Objective There aren't any examples of how to draw a ui material with borders. ## Solution Add border rendering to the `ui_material` example's shader. ## Showcase <img width="395" alt="bordermat" src="https://github.com/user-attachments/assets/109c59c1-f54b-4542-96f7-acff63f5057f"> --------- Co-authored-by: charlotte <charlotte.c.mcelwain@gmail.com>	2024-09-09 16:34:24 +00:00
charlotte	a4640046fc	Adds `ShaderStorageBuffer` asset (#14663 ) Adds a new `Handle<Storage>` asset type that can be used as a render asset, particularly for use with `AsBindGroup`. Closes: #13658 # Objective Allow users to create storage buffers in the main world without having to access the `RenderDevice`. While this resource is technically available, it's bad form to use in the main world and requires mixing rendering details with main world code. Additionally, this makes storage buffers easier to use with `AsBindGroup`, particularly in the following scenarios: - Sharing the same buffers between a compute stage and material shader. We already have examples of this for storage textures (see game of life example) and these changes allow a similar pattern to be used with storage buffers. - Preventing repeated gpu upload (see the previous easier to use `Vec` `AsBindGroup` option). - Allow initializing custom materials using `Default`. Previously, the lack of a `Default` implementation for the raw `wgpu::Buffer` type made implementing a `AsBindGroup + Default` bound difficult in the presence of buffers. ## Solution Adds a new `Handle<Storage>` asset type that is prepared into a `GpuStorageBuffer` render asset. This asset can either be initialized with a `Vec<u8>` of properly aligned data or with a size hint. Users can modify the underlying `wgpu::BufferDescriptor` to provide additional usage flags. ## Migration Guide The `AsBindGroup` `storage` attribute has been modified to reference the new `Handle<Storage>` asset instead. Usages of `Vec` should be converted into assets instead. --------- Co-authored-by: IceSentry <IceSentry@users.noreply.github.com>	2024-09-02 16:46:34 +00:00
IceSentry	bfcb19a871	Add example showing how to use SpecializedMeshPipeline (#14370 ) # Objective - A lot of mid-level rendering apis are hard to figure out because they don't have any examples - SpecializedMeshPipeline can be really useful in some cases when you want more flexibility than a Material without having to go to low level apis. ## Solution - Add an example showing how to make a custom `SpecializedMeshPipeline`. ## Testing - Did you test these changes? If so, how? - Are there any parts that need more testing? - How can other people (reviewers) test your changes? Is there anything specific they need to know? - If relevant, what platforms did you test these changes on, and are there any important ones you can't test? --- ## Showcase The example just spawns 3 triangles in a triangle pattern. ![image](https://github.com/user-attachments/assets/c3098758-94c4-4775-95e5-1d7c7fb9eb86) --------- Co-authored-by: Alice Cecile <alice.i.cecile@gmail.com>	2024-07-31 18:24:58 +00:00
IceSentry	011f71a245	Update ui_material example to be a slider instead (#14031 ) # Objective - Some people have asked how to do image masking in UI. It's pretty easy to do using a `UiMaterial` assuming you know how to write shaders. ## Solution - Update the ui_material example to show the bevy banner slowly being revealed like a progress bar ## Notes I'm not entirely sure if we want this or not. For people that would be comfortable to use this for their own games they would probably have already figured out how to do it and for people that aren't familiar with shaders this isn't really enough to make an actual slider/progress bar. --------- Co-authored-by: François Mockers <francois.mockers@vleue.com>	2024-06-27 21:23:04 +00:00
Patrick Walton	44db8b7fac	Allow phase items not associated with meshes to be binned. (#14029 ) As reported in #14004, many third-party plugins, such as Hanabi, enqueue entities that don't have meshes into render phases. However, the introduction of indirect mode added a dependency on mesh-specific data, breaking this workflow. This is because GPU preprocessing requires that the render phases manage indirect draw parameters, which don't apply to objects that aren't meshes. The existing code skips over binned entities that don't have indirect draw parameters, which causes the rendering to be skipped for such objects. To support this workflow, this commit adds a new field, `non_mesh_items`, to `BinnedRenderPhase`. This field contains a simple list of (bin key, entity) pairs. After drawing batchable and unbatchable objects, the non-mesh items are drawn one after another. Bevy itself doesn't enqueue any items into this list; it exists solely for the application and/or plugins to use. Additionally, this commit switches the asset ID in the standard bin keys to be an untyped asset ID rather than that of a mesh. This allows more flexibility, allowing bins to be keyed off any type of asset. This patch adds a new example, `custom_phase_item`, which simultaneously serves to demonstrate how to use this new feature and to act as a regression test so this doesn't break again. Fixes #14004. ## Changelog ### Added * `BinnedRenderPhase` now contains a `non_mesh_items` field for plugins to add custom items to.	2024-06-27 16:13:03 +00:00
JMS55	c50a4d8821	Remove unused mip_bias parameter from apply_normal_mapping (#13752 ) Mip bias is no longer used here	2024-06-10 13:00:34 +00:00


131 Commits