Commit Graph

131 Commits

Author SHA1 Message Date
Visse 6f270d4776 Expose pipeline constants to materials (#24502)
# Objective

Allow materials to use pipeline constants ([pipeline-overridable
constants](https://www.w3.org/TR/WGSL/#override-decls)).
They are already available in wgpu, but bevy didn't expose them.

## Solution

Expose constants in `RenderPipelineDescriptor` &
`ComputePipelineDescriptor`, allowing materials to specify them in their
specialize function.

**Note:** I had to remove the `Eq` derive from
`ComputePipelineDescriptor`, `VertexState` and `FragmentState`, due to
the new f64 field. It was already a bit inconsistent with
`RenderPipelineDescriptor` not having it.
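
A minimal standalone sketch of why the `Eq` derive had to go. The
`PipelineDescriptorSketch` type, its fields, and the `with_constant` helper
are hypothetical stand-ins, not Bevy's actual definitions. The only
assumption taken from the note above is that override values are stored as
`f64`.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for a pipeline descriptor carrying override constants.
/// `Eq` cannot be derived here because `f64` only implements `PartialEq`.
#[derive(Clone, Debug, PartialEq)]
pub struct PipelineDescriptorSketch {
    pub label: String,
    /// Override name -> value (assumed to be stored as f64).
    pub constants: HashMap<String, f64>,
}

/// Hypothetical helper: build a descriptor with a single override constant.
pub fn with_constant(label: &str, name: &str, value: f64) -> PipelineDescriptorSketch {
    let mut constants = HashMap::new();
    constants.insert(name.to_string(), value);
    PipelineDescriptorSketch {
        label: label.to_string(),
        constants,
    }
}
```

Two descriptors built with the same `LEVELS` value compare equal, but one
holding `f64::NAN` is not equal to its own clone, which is exactly the
reflexivity guarantee that `Eq` would require.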

## Testing
- Ran `cargo check`
- Created an example & ran it
- Couldn't run `cargo test` due to it taking looots of disk space to
run, but I have a hard time seeing it break something at runtime

## Showcase

See the added example, where pipeline constants are used to change the
`LEVELS` override in WGSL.
<img width="760" height="289" alt="Screenshot from 2026-05-31 09-46-05"
src="https://github.com/user-attachments/assets/6902757c-aea4-4b91-9ff0-e653ce4c3448"
/>
2026-06-09 00:06:01 +00:00
Luo Zhihao 7517c61ecd Add options to lower precision/compressed vertex buffer (#21926)
# Objective

Resolves #21902.

## Solution

This PR adopts a relatively transparent approach to reduce the GPU
vertex buffer size. On the CPU side, meshes can still use uncompressed Float32
data, and users are not required to insert compressed vertex formats.
The vertex data is automatically processed into
lower-precision/octahedral encoded data when uploading to the GPU.

To enable vertex attribute compression, just set the
`attribute_compression` field of Mesh, or set
`mesh_attribute_compression` of GltfLoaderSettings. If enabled, normals
and tangents are octahedral-encoded as Unorm16x2, while uv0, uv1, joint
weights, and colors use the corresponding Unorm16 or Float16 formats. I
also provide Unorm8x4 for vertex color if HDR isn't needed.
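
For readers unfamiliar with octahedral encoding, here is a minimal
standalone sketch of the common formulation, packing a unit normal into two
16-bit unorm values. The function names are hypothetical and this is not
taken from the PR's code, which may differ in details such as rounding or
how tangents are handled.

```rust
/// Sign that never returns zero, which the octahedral fold relies on.
fn sign_not_zero(x: f32) -> f32 {
    if x >= 0.0 { 1.0 } else { -1.0 }
}

/// Encode a unit normal into two 16-bit unorm values (octahedral mapping).
pub fn oct_encode_unorm16(n: [f32; 3]) -> [u16; 2] {
    // Project onto the octahedron |x| + |y| + |z| = 1.
    let l1 = n[0].abs() + n[1].abs() + n[2].abs();
    let (mut u, mut v) = (n[0] / l1, n[1] / l1);
    // Fold the lower hemisphere over the diagonals.
    if n[2] < 0.0 {
        let (ou, ov) = (u, v);
        u = (1.0 - ov.abs()) * sign_not_zero(ou);
        v = (1.0 - ou.abs()) * sign_not_zero(ov);
    }
    // Map [-1, 1] to [0, 65535].
    let to_unorm = |x: f32| ((x * 0.5 + 0.5).clamp(0.0, 1.0) * 65535.0).round() as u16;
    [to_unorm(u), to_unorm(v)]
}

/// Decode two 16-bit unorm values back into a unit normal.
pub fn oct_decode_unorm16(e: [u16; 2]) -> [f32; 3] {
    let u = e[0] as f32 / 65535.0 * 2.0 - 1.0;
    let v = e[1] as f32 / 65535.0 * 2.0 - 1.0;
    let z = 1.0 - u.abs() - v.abs();
    // Undo the fold for points that came from the lower hemisphere.
    let t = (-z).max(0.0);
    let x = u - sign_not_zero(u) * t;
    let y = v - sign_not_zero(v) * t;
    let len = (x * x + y * y + z * z).sqrt();
    [x / len, y / len, z / len]
}
```

A round trip through `oct_encode_unorm16` and `oct_decode_unorm16` stores a
normal in 4 bytes instead of the 12 that three `f32` components need, at the
cost of a small quantization error.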

Update 2026-2-16

Removed previous approach that automatically compresses vertex buffer
according to flags when uploading to GPU. Instead, I added
`compressed_mesh` method to Mesh to construct compressed Mesh ahead of
time. GltfLoader can also opt in to mesh compression when loading. I also
added an option to convert indices to u16, though I believe the Blender
glTF exporter already uses u16 indices when possible.
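
The index conversion can be sketched as follows. This is a hypothetical
helper written for illustration, not the method added by the PR: it narrows
a `u32` index list to `u16` only when every index fits.

```rust
/// Narrow 32-bit indices to 16-bit. Returns `None` if any index exceeds
/// `u16::MAX`, in which case the caller keeps the original u32 indices.
pub fn indices_to_u16(indices: &[u32]) -> Option<Vec<u16>> {
    indices
        .iter()
        .map(|&i| {
            if i <= u16::MAX as u32 {
                Some(i as u16)
            } else {
                None
            }
        })
        .collect()
}
```

Collecting into `Option<Vec<u16>>` stops at the first index that does not
fit, so meshes with more than 65,536 vertices are left untouched.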


## Testing

Run `many_cubes`, `many_foxes`, `many_morph_targets` with
`--vertex-compression` to test 3d.
Run `bevymark` with `sprite_mesh` to test 2d, because `SpriteMesh` uses
compressed quad mesh now.

---------

Co-authored-by: Greeble <166992735+greeble-dev@users.noreply.github.com>
2026-06-02 03:22:56 +00:00
Christophe Dehais 5ec83e1185 Improve Order Independent Transparency example (#22781)
# Objectives

- Use a cleaner UI inspired by other examples
- Add a scene with custom material to fix #20297 

## Testing

Running the example
## Showcase
<details>
<summary>Click to view showcase</summary>
<img width="872" height="732" alt="Screenshot From 2026-02-06 11-35-19"
src="https://github.com/user-attachments/assets/cc294c33-bb53-4ed4-9dce-7558f3bb8fee"
/>
<img width="872" height="732" alt="Screenshot From 2026-02-06 11-35-30"
src="https://github.com/user-attachments/assets/c2b79651-c99e-4bcc-8089-518d11631b9f"
/>
</details>

---------

Co-authored-by: Carter Anderson <mcanders1@gmail.com>
2026-05-21 15:13:25 +00:00
Duncan afd576f380 Alpha discard (#11895)
# Objective

I want to call the `pbr_functions::alpha_discard` shader function from
my own material shader, but it only takes a `StandardMaterial`
parameter.

## Solution

This PR replaces the `StandardMaterial` parameter with less restrictive
ones so my shader can call it.

---

## Changelog

- The signature of `pbr_functions::alpha_discard` changed by replacing
`StandardMaterial` with only the `flags` and `alpha_cutoff` fields.

## Migration Guide

Replace this:

```wgsl
pbr_functions::alpha_discard(material, ...);
```

with this:

```wgsl
pbr_functions::alpha_discard(material.flags, material.alpha_cutoff, ...);
```

---------

Co-authored-by: Duncan Fairbanks <duncanfairbanks6@gmail.com>
2026-05-21 08:49:32 +00:00
Patrick Walton 1b0f112d90 Implement GPU clustering for lights, light probes, and decals. (#23036)
Currently, Bevy clusters lights on the CPU. This is generally not
considered a best practice any longer, and it can be a bottleneck in
workloads like `many_lights`. Moreover, it prevents GPU systems like
[Hanabi] from creating clusterable objects such as lights and decals
without a round trip to the CPU.

This PR introduces GPU light clustering when supported by the hardware.
The algorithm is the same as the existing CPU light clustering, but
parallelized over all clusters, and the resulting on-GPU format for
clusters is unchanged. GPU light clustering uses the hardware rasterizer
for compute purposes as a way to automatically distribute workloads
within 2D axis-aligned bounding boxes without actually rendering any
pixels, a first for Bevy. The algorithm is as follows, with each step
corresponding to a raster or compute command:

1. *Z slicing*: We have a 3D cluster froxel grid of size WxHxD and seek
to rasterize D axis-aligned quads, each of size WxH, representing the
range of each clusterable object. In this compute phase, we generate D
indirect instances for each clusterable object for the subsequent
indirect draws.

2. *Count rasterization*: We use instanced indirect drawing to rasterize
each quad generated in step 1 to a viewport of size WxH, with color
writes disabled. Each rasterized fragment represents a cluster-object
pair. In the fragment shader, we check to see if the object intersects
the cluster, and, if it does, we atomically bump a counter corresponding
to the number of objects of the given type intersecting the cluster in
question. We don't record the ID of the object in this phase; we simply
count the number of objects.

3. *Local allocation*: Now that we know the number of objects of each
type in each cluster, we can proceed to allocate space in the clustered
object buffer for each clustered object list. To do this, we need to
perform a [*prefix sum*] operation so that each list is tightly packed
with the others. For example, if adjacent clusters have 2, 5, and 3
objects, they'll be allocated at offsets 0, 2, and 7 respectively. This
*local* step uses a [Hillis-Steele scan] in shared memory to compute the
prefix sum of each chunk of 256 clusters. We can't go beyond 256
clusters in this local step because 256 is the maximum workgroup size in
`wgpu`.

4. *Global allocation*: To deal with the fact that we can't calculate
prefix sums beyond 256 clusters in step 3, we employ this second step
that does a sequential loop over every 256-cluster chunk, propagating
the prefix sum. At the end of this step, every list of clustered objects
is allocated.

5. *Populate rasterization*: Finally, we issue an instanced indirect
draw command using the same parameters as step (2). We test each
cluster-object pair for intersection, and, if the test passes, we record
the ID of each clustered object into the correct space in the list,
using a scratch pad buffer of atomics to store the position of the next
object in each list.
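
The allocation in steps 3 and 4 can be sketched on the CPU as follows. This
is an illustrative serial model, not the PR's shader code: the function
names are hypothetical, and `chunk_size` stands in for the 256-cluster
workgroup limit mentioned above.

```rust
/// Hillis-Steele inclusive scan over one chunk. On a GPU, each pass of the
/// `while` loop runs in parallel across the workgroup; here it is serial.
fn hillis_steele_inclusive(chunk: &[u32]) -> Vec<u32> {
    let mut cur = chunk.to_vec();
    let mut stride = 1;
    while stride < cur.len() {
        // Read from the previous pass so writes don't race with reads.
        let prev = cur.clone();
        for i in stride..cur.len() {
            cur[i] = prev[i] + prev[i - stride];
        }
        stride *= 2;
    }
    cur
}

/// Exclusive start offset of every list. The local scan per chunk models
/// step 3; the running `base` carried across chunks models step 4.
/// `chunk_size` must be nonzero.
pub fn allocate_offsets(counts: &[u32], chunk_size: usize) -> Vec<u32> {
    let mut offsets = Vec::with_capacity(counts.len());
    let mut base = 0u32;
    for chunk in counts.chunks(chunk_size) {
        let inclusive = hillis_steele_inclusive(chunk);
        for (i, &count) in chunk.iter().enumerate() {
            // Exclusive offset = inclusive sum minus the list's own length.
            offsets.push(base + inclusive[i] - count);
        }
        base += inclusive.last().copied().unwrap_or(0);
    }
    offsets
}
```

With counts of 2, 5, and 3, this yields offsets 0, 2, and 7, matching the
example in step 3. Shrinking `chunk_size` below the input length exercises
the cross-chunk propagation of step 4 without changing the result.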

The buffer of clustered objects has a fixed size and can overflow. We
detect this condition via asynchronous CPU readback and automatically
grow the buffer for subsequent frames. In this case, we also log a
message so that the developer can choose a larger initial buffer size
and avoid any incorrect frames. Additionally, like #22874, the automatic
clustering heuristics are dynamically adjusted from frame to frame, by
recording statistics on the GPU and using CPU readback to download them
back to the CPU for processing.

As part of this PR, I refactored clustered visibility so that clustered
objects go through the same `ViewVisibility` system as other objects,
instead of using `VisibleClusterableObjects`. This was a nice
simplification.

On the `many_lights` benchmark, with about 8,000 lights visible out of
100,000, this process takes approximately 0.099 ms on my NVIDIA GeForce
RTX 4070 Laptop GPU. The AMD Ryzen 9 8945HS CPU, however, takes 2.12 ms
to do the same task. The GPU version is therefore a 21x speedup.

`main` `assign_objects_to_clusters` time, 2.12 ms:
<img width="2756" height="1800" alt="Screenshot 2026-02-17 222757"
src="https://github.com/user-attachments/assets/66341ad2-96f2-4e4a-87ee-fe3462bc05de"
/>

GPU clustering GPU time, 0.099 ms:
<img width="2756" height="1800" alt="Screenshot 2026-02-17 222458"
src="https://github.com/user-attachments/assets/18e2e0ae-a946-4b80-b38a-0543e76ebc02"
/>

`main`, 5.71 ms median frame time, 175 FPS:
<img width="2756" height="1800" alt="Screenshot 2026-02-17 222243"
src="https://github.com/user-attachments/assets/111c8e22-414f-4ee1-95fa-d7cfe422c2ab"
/>

GPU clustering, 4.88 ms median frame time, 205 FPS:
<img width="2756" height="1800" alt="Screenshot 2026-02-17 222256"
src="https://github.com/user-attachments/assets/0a662e88-a1b9-49c8-8bab-cc12b46cd079"
/>

[Hanabi]: https://github.com/djeedai/bevy_hanabi

[*prefix sum*]: https://en.wikipedia.org/wiki/Prefix_sum

[Hillis-Steele scan]:
https://en.wikipedia.org/wiki/Prefix_sum#Algorithm_1:_Shorter_span,_more_parallel


## Alice's PM Note from @kfc35

Fixes https://github.com/bevyengine/bevy/issues/22957 and also fixes
https://github.com/bevyengine/bevy/issues/22904.
2026-02-28 17:07:33 +00:00
Jordan Halase 5e1630bfd8 Fix 16 byte alignment typo (WebGL 2: 16 bit -> 16 byte) (#23124)
# Objective

WebGL 2 requires 16 **byte** UBO alignment. Some comments incorrectly
state 16 **bits**.

## Solution

Fix comment typos.

## Testing

N/A

---

## Showcase

N/A
2026-02-24 00:51:54 +00:00
Chris Biscardi 71ce303ec2 compute-shader mesh generation example (#22296)
# Objective

People have been asking how to get a compute shader-built mesh into
bevy's "stuff".

Some people want to control the lifetime of the mesh via Handle, and
others don't know how to set data in bind groups.

## Solution

A new example that shows how to initialize a mesh handle with a
render_world usage mesh, and then put the output of the compute shader
into the mesh_allocator slab for the mesh.

The demo creates a scene with a camera, light, a circular base mesh, and
an empty "cube to be" mesh that is shared by cloning the handle across
two entities. The compute shader then fills in the data directly into
the mesh_allocator slabs for the vertex/index buffers.

If the compute shader failed, there would be no cube meshes showing as
the data would be empty.

## Testing

```
cargo run --example compute_mesh
```

---

## Showcase


<img width="3392" height="2106" alt="screenshot-2025-12-29-at-16 06
48@2x"
src="https://github.com/user-attachments/assets/88d8fed4-e3c1-418e-bb04-6f08d673403a"
/>
2026-01-16 00:06:16 +00:00
Patrick Walton 4189ba072d Provide a mechanism for applications to invoke the single-pass downsampler. (#22286)
The [AMD FidelityFX single-pass downsampler] (SPD) is the fastest way to
generate mipmap levels of a texture. Bevy currently has two separate
ports of that algorithm to WGSL: one for use in the environment map
generation and one for use on the depth buffer for the purposes of
occlusion culling (though the latter isn't the best use of it). Absent
is any mechanism to use the single-pass downsampler to generate mipmap
levels of a color texture for typical use in rendering. This is a
standard feature in game engines: for example, Unity has
[`GenerateMips`] and Unreal has [`bAutoGenerateMips`].

This PR adds a mechanism by which applications can invoke SPD to
generate mipmap levels for any `Image`. Using this mechanism is a two
step process. First, the application adds the `Handle<Image>` to a
resource, `MipGenerationJobs` and associates it with a *phase*, which is
an arbitrary ID chosen by the application. Second, the application adds
a `MipGenerationNode` for that phase to the render graph. During
rendering, the `MipGenerationNode` invokes SPD to generate a full mipmap
chain for all textures in that phase.
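
For reference, the size of a "full mipmap chain" follows the usual
convention of halving each dimension until both reach 1. The sketch below
is standalone arithmetic with hypothetical function names. It does not use
or describe the `MipGenerationNode` API itself.

```rust
/// Number of levels in a full mip chain: floor(log2(max(w, h))) + 1.
pub fn mip_level_count(width: u32, height: u32) -> u32 {
    let largest = width.max(height).max(1);
    32 - largest.leading_zeros()
}

/// Dimensions of every level, halving (rounding down, minimum 1) per level.
pub fn mip_chain(width: u32, height: u32) -> Vec<(u32, u32)> {
    (0..mip_level_count(width, height))
        .map(|level| ((width >> level).max(1), (height >> level).max(1)))
        .collect()
}
```

A 256x256 texture therefore has 9 levels, and an 8x2 texture has 4: 8x2,
4x1, 2x1, and 1x1.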

The reason why mipmap generation jobs are associated with phases is that
the generation of mipmaps may need to occur at precise points in the
application rendering cycle. For example, consider the common situation
of a mipmapped portal texture. The mipmaps must be generated *after* the
portal is rendered, but *before* the object in the main world displaying
the portal texture is drawn. The phased approach taken in this PR allows
complex dependencies like this to be expressed using the node graph
feature that Bevy already possesses. (In the future, if render graphs
are removed in favor of systems, this approach can naturally be reframed
in terms of systems, so this patch contains no hazards in that regard.)

Note that this patch by itself doesn't automatically generate mipmaps
for imported textures that don't have them the way that
[`bevy_mod_mipmap_generator`] does, in order to keep this patch
relatively small and self-contained. However, it'd be straightforward to
either (a) extend `bevy_mod_mipmap_generator`, (b) write another plugin,
and/or (c) add a new feature to Bevy itself, all built on top of this
PR, to support automatic GPU mip generation for image assets that don't
have them.

A new example, `dynamic_mip_generation`, has been added. This is a 2D
example that produces a texture at runtime on the CPU and invokes the
new `MipGenerationNode` that this patch adds to generate mipmaps for
that texture at runtime. The colors of the texture are randomly
generated, and UI for the example allows the texture to be regenerated
and for the size to be adjusted; this proves that the mipmap levels for
the texture are indeed generated at runtime and not pre-calculated at
build time. Note that, although the example is 2D, the feature that this
patch adds can be equally used in 2D and 3D.

[AMD FidelityFX single-pass downsampler]:
https://gpuopen.com/fidelityfx-spd/

[`GenerateMips`]:
https://docs.unity3d.com/ScriptReference/Rendering.CommandBuffer.GenerateMips.html

[`bAutoGenerateMips`]:
https://dev.epicgames.com/documentation/en-us/unreal-engine/API/Plugins/DisplayClusterConfiguration/FDisplayClusterC-_33/bAutoGenerateMips

[`bevy_mod_mipmap_generator`]:
https://github.com/DGriffin91/bevy_mod_mipmap_generator
2025-12-31 23:05:54 +00:00
IceSentry 681751647a Add FullscreenMaterial (#20414)
# Objective

- Users often want to run a fullscreen shader but the current solution
involves copying the custom_post_processing example which is a 350 line
file with a lot of low level wgpu complexity. Users shouldn't have to
deal with that just to make a fullscreen shader

## Solution

- Introduce a new FullscreenMaterial trait and FullscreenMaterialPlugin
- This new material will run a fullscreen triangle with the specified
shader. It builds on top of the existing FullscreenShader infrastructure
- It lets users customize the node ordering. There are no defaults right
now because it's intended as a bit of a primitive plugin. Eventually we
could have some kind of default for custom post processing

## Testing

Made a new fullscreen_material example and made sure it works

## Follow up

Once this is merged there are various things that should be done to
improve it. Add the option to bind the depth texture, offer defaults for
post processing, use a full AsBindGroup, add a way to bind the gbuffer.

---------

Co-authored-by: JMS55 <47158642+JMS55@users.noreply.github.com>
Co-authored-by: Alice Cecile <alice.i.cecile@gmail.com>
2025-12-14 22:35:20 +00:00
mgi388 d60a1b8166 Add MeshTag to array_texture example to demonstrate layer selection in shader (#21989)
## Objective

- When looking at the `array_texture` example, it wasn't clear to me how
I could send the "layer" to the GPU, but it turns out that [MeshTag is
one recommended
way](https://discord.com/channels/691052431525675048/866787577687310356/1444495450999754823)
to pass this.
- The example previously extracted a fake "layer" from the world
position, but IIUC this isn't the most realistic way to demonstrate
layer selection.

## Solution

- Update the `array_texture` example by using `MeshTag`.
- Add a system to the example that periodically changes the `MeshTag` on
entities to show that the mesh tag can also change dynamically at
runtime (and show how easy it is).

## Testing and showcase

Before, you can see each cube's texture is fixed.

<img width="1280" height="747" alt="image"
src="https://github.com/user-attachments/assets/1ffde7db-8110-4431-b4e8-3a5a4ba5c5db"
/>

After, you can see each cube's texture changes as time passes.


https://github.com/user-attachments/assets/b1227659-5886-4d2c-a401-84b80423c798

----

I'm hoping for a rendering dev to validate this approach is correct, and
useful. I think it is, but [I'm only just starting to
understand](https://discord.com/channels/691052431525675048/866787577687310356/1444888786478829668)
how to use this stuff and it's [possibly not the only
way](https://discord.com/channels/691052431525675048/866787577687310356/1444888304020488365)
so I don't want to submit this if it's the wrong approach to teach
future me's.
2025-12-09 23:38:23 +00:00
Patrick Walton 00f6eb7a1c Implement the infrastructure needed to support portals and mirrors. (#13797)
Implement the infrastructure needed to support portals and mirrors.

Bevy currently supports multiple cameras and rendering to off-screen
render targets, so one might naïvely think that the engine has support
for portals and mirrors already. However, Bevy is missing two key
features that enable portals and mirrors at present:

1. Bevy has support for neither custom clip planes nor oblique clip
planes. This prevents the construction of proper portals or mirrors, as
meshes that intersect the portal plane must be clipped to render
properly.

2. Bevy has no support for cameras that invert the culling mode, so
meshes that are reflected across a plane will render inside-out.

This PR addresses the two issues above:

1. This commit introduces a new field on `PerspectiveProjection`,
`near_plane`, which allows the application to specify a custom near
plane. That feature fully enables [Lengyel oblique clipping], which is
the most optimal way to achieve a custom near clipping plane. It allows
us to avoid having to support custom clip planes, which are often
implemented inefficiently in hardware.

2. This patch adds a new field on the `Camera` component,
`invert_culling`. This field causes the Bevy renderer to invert the
front face setting when rendering the objects visible from that camera.
When coupled with an appropriately-set [Householder matrix] on the
camera, this allows correct rendering of objects reflected across a
plane.

Additionally, this PR adds a new function to `bevy_math::mat3`,
`reflection_matrix`. This generates the matrix that reflects objects
across a plane, suitable for encoding into a `Transform`. It's fully
documented for ease of use.
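
As a rough illustration of what such a matrix looks like, here is a
standalone sketch of a Householder reflection across a plane through the
origin, using plain arrays instead of `bevy_math` types. The function names
and signatures are hypothetical and are not the ones added by this PR.

```rust
pub type Mat3 = [[f32; 3]; 3];

/// Householder reflection across the plane through the origin with unit
/// normal `n`: R = I - 2 * n * n^T. The matrix is symmetric.
pub fn householder_reflection(n: [f32; 3]) -> Mat3 {
    let mut m = [[0.0f32; 3]; 3];
    for i in 0..3 {
        for j in 0..3 {
            let identity = if i == j { 1.0 } else { 0.0 };
            m[i][j] = identity - 2.0 * n[i] * n[j];
        }
    }
    m
}

/// Multiply a matrix by a column vector.
pub fn mul_vec(m: &Mat3, v: [f32; 3]) -> [f32; 3] {
    let mut out = [0.0f32; 3];
    for i in 0..3 {
        out[i] = m[i][0] * v[0] + m[i][1] * v[1] + m[i][2] * v[2];
    }
    out
}

/// Determinant of a 3x3 matrix by cofactor expansion along the first row.
pub fn determinant(m: &Mat3) -> f32 {
    m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
        - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
        + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0])
}
```

The determinant of a reflection is -1, so triangle winding flips under it.
That is why a mirrored camera needs `invert_culling` to avoid rendering
meshes inside-out.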

Finally, a new example, `mirror`, has been added. This example is a
complete instance of a working mirror, combining a camera with a
Householder matrix, oblique projection, and inverted culling with a
custom material to render an animated mesh and its planar reflection.
The camera and mesh may be moved with the mouse, and the off-screen
render target that stores the rendered contents of the mirror world is
properly resized when the user resizes the window.

[Lengyel oblique clipping]:
https://terathon.com/lengyel/Lengyel-Oblique.pdf

[Householder matrix]:
https://en.wikipedia.org/wiki/Householder_transformation

<img width="2564" height="1500" alt="Screenshot 2025-12-05 212155"
src="https://github.com/user-attachments/assets/35652b58-a9a5-415a-bdff-367889a23b9f"
/>
2025-12-09 23:08:15 +00:00
Patrick Walton 185712fbef Add support for normal maps, metallic-roughness maps, and emissive maps to clustered decals. (#22039)
This commit expands the number of textures associated with each
clustered decal from 1 to 4. The additional 3 textures apply normal
maps, metallic-roughness maps, and emissive maps respectively to the
surfaces onto which decals are projected.

Normal maps are combined using the [*Whiteout* blending method] from
SIGGRAPH 2007. This approach was chosen because, subjectively, it
appeared better than the more complex [*reoriented normal mapping*
(RNM)] approach. Additionally, *Whiteout* normal map blending is
commutative and associative, unlike RNM, which is a useful property for
our decals, which are currently applied in an unspecified order. (The
fact that the order in which our decals are applied is unspecified is
unfortunate, but is a long-standing issue and should probably be fixed
in a followup.) In particular, commutativity is desirable because
otherwise one must specify which normal map is the *base* normal map and
which normal map is the *detail* normal map, but that's not a policy
decision that Bevy can unconditionally make, as decals aren't necessarily
more detailed than the base normal map. (For instance, consider a bullet
hole decal embedded in a wall with a subtle rough texture; one might
reasonably argue that the base material's normal map is the detail map
and the bullet hole is the base map, even though the bullet hole's
normal map comes from a decal.)
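
The *Whiteout* blend itself is short enough to sketch. The version below is
a standalone CPU illustration of the published formula (sum the xy
components, multiply the z components, renormalize) on tangent-space
normals. The function names are hypothetical and this is not the shader
code from this commit.

```rust
/// Normalize a 3-component vector.
fn normalize(v: [f32; 3]) -> [f32; 3] {
    let len = (v[0] * v[0] + v[1] * v[1] + v[2] * v[2]).sqrt();
    [v[0] / len, v[1] / len, v[2] / len]
}

/// Whiteout blend of two tangent-space normals:
/// normalize(a.xy + b.xy, a.z * b.z).
pub fn whiteout_blend(a: [f32; 3], b: [f32; 3]) -> [f32; 3] {
    normalize([a[0] + b[0], a[1] + b[1], a[2] * b[2]])
}
```

Because the formula only uses a sum and a product, swapping the two inputs
gives the same result, so neither map has to be designated the *base* or
the *detail* map. Blending with the flat normal `[0, 0, 1]` leaves a unit
normal unchanged.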

Note that, with a custom material shader, it's possible for application
code to use the decal images for arbitrary other purposes. For example,
with a custom shader an application might use the metallic-roughness map
as a clearcoat map instead if it has no need for a metallic-roughness
map on a decal. And, of course, a custom material shader could adopt RNM
blending for decals if it wishes.

A new example, `clustered_decal_maps`, has been added. This example
demonstrates the new maps by spawning clustered decals with maps
randomly over time and projecting them onto a wall.

<img width="2564" height="1500" alt="Screenshot 2025-12-05 095953"
src="https://github.com/user-attachments/assets/255fca64-2b42-4794-a367-14336d023310"
/>
2025-12-09 18:14:55 +00:00
shunkie 0410482c13 Remove unused num_workgroups from game_of_life shader (#21944)
# Objective

Remove unused `num_workgroups`.

## Testing

```
cargo run --example compute_shader_game_of_life
```
2025-11-26 04:03:02 +00:00
Patrick Walton 89f9dcb431 Don't require cameras to have color render targets. (#20830)
It can occasionally be useful to have cameras that *only* render
prepasses such as depth. Other game engines such as Unity support this
feature by allowing a depth-only render target to be assigned to a
camera. Bevy, however, has no easy mechanism for this. (Creating a
`ShadowView` in the render app doesn't work, because various places in
rendering assume that shadow views are associated with lights.)

This patch fixes the problem by introducing a new type of
`RenderTarget`, `RenderTarget::None`. Cameras with no render target will
skip the main opaque and transparent render passes, but any prepasses on
such cameras will still occur. Adding a `DepthPrepass` to such a camera
enables depth-only cameras, with maximum efficiency as the fragment
shader won't exist and no color buffer will be bound.

Note that, when no render target is specified, the physical size of the
viewport must be explicitly specified, as Bevy has no other mechanism to
determine it.

A new example, `render_depth_to_texture`, has been added, containing a
rotating cube and a depth-only camera orbiting it. The depth texture
that the camera produces is rendered onto a plane using a custom shader.
(NB: In such scenarios, the depth texture must be copied from the camera
to a custom image due to (a) the `wgpu` limitation that a depth texture
can't be both a render target and bindable as a texture and (b) the fact
that Bevy depth textures are managed by Bevy itself and exposed only to
the render world. The example uses a custom render node to perform the
copy.) The depth-only camera can be moved using the WASD keys.

<img width="2564" height="1500" alt="Screenshot 2025-09-02 080508"
src="https://github.com/user-attachments/assets/415e7f4d-393d-4be3-b569-829c06901078"
/>
2025-09-03 03:18:39 +00:00
charlotte 🌸 b6922f98d1 Revert bevy_sprite_render rename in shaders (#20644)
Fixes #20643
2025-08-18 23:11:28 +00:00
Rob Parrett 3560b112f4 Fix imports in some 2d examples with custom shaders (#20639)
# Objective

Fixes #20615

## Solution

These shaders weren't updated when the import moved in #20587.

Fix the imports.

## Testing

```
cargo run --example custom_gltf_vertex_attribute
cargo run --example shader_material_2d
cargo run --example mesh2d_manual
```

Co-authored-by: François Mockers <francois.mockers@vleue.com>
2025-08-18 21:59:54 +00:00
dontgetfoundout 5bc5a1325a Update Game of Life compute example to include a uniform buffer variable (#20466)
# Objective
It is currently a little unclear how to use uniform buffers in compute
shaders. The other examples of uniform buffers in the Bevy examples and
codebase either are built on Materials or use `DynamicUniformBuffer`s
created from a `ViewNode`. Neither of these are a great fit for use in a
compute shader.

## Solution
Update the compute shader example to pass a uniform buffer to the shader
that determines the color for alive cells.

## Discussion Topics
- Is this the right way to pass this data to the shader?
- Should we be encouraging use of uniform buffers in compute shaders at
all? Some in the community prefer the ergonomics of storage buffers in
most (all?) compute shader cases. Do we want to push users to use
storage buffers instead?
- I took the idea to use color as the input from IceSentry on Discord,
but this did require me to change the texture format to support non-red
colors. Does this undermine the goals of the shader example? Is this the
wrong texture format?

## Testing

- Did you test these changes? If so, how?
  - The changes were manually validated with a number of different
    `LinearRgba` values for `alive_color`
- Are there any parts that need more testing?
- How can other people (reviewers) test your changes? Is there anything
  specific they need to know?
  - `cargo run --example compute_shader_game_of_life`
  - Color can be set using `alive_color` property on `GameOfLifeUniforms`
- If relevant, what platforms did you test these changes on, and are
  there any important ones you can't test?
  - Manually validated on Windows and WASM (WebGPU) targets
    - WASM WebGL2 doesn't appear to support textures in compute shaders

---

## Showcase
<img width="1602" height="939" alt="image"
src="https://github.com/user-attachments/assets/9a535617-a179-4f20-b686-596899f11d18"
/>

---------

Co-authored-by: dontgetfoundout <inflatedego@gmail.com>
2025-08-11 22:52:02 +00:00
charlotte 🌸 e6ec2c181d Material bind group shader def (#20069)
Use a shader def for the material bind group index to make it easier to
switch back to group 2 in the future without breaking everyone again.

---------

Co-authored-by: Alice Cecile <alice.i.cecile@gmail.com>
Co-authored-by: atlv <email@atlasdostal.com>
Co-authored-by: atlas dostal <rodol@rivalrebels.com>
2025-08-06 05:09:12 +00:00
Gilles Henaux ca25a67d0d Fix the extended_material example on WebGL2 (#18812)
# Objective

- Fixes #13872 (also mentioned in #17167)

## Solution

- Added conditional padding fields to the shader uniform
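
The padding idea can be illustrated on the Rust side with a standalone
sketch. The struct below is a hypothetical CPU-side mirror of a uniform
holding a single `u32`, not the example's actual `MyExtension` type, and
the PR itself adds the padding in the shader's uniform struct.

```rust
/// Hypothetical uniform layout: one `u32` of data plus explicit padding so
/// the struct's total size is a multiple of 16 bytes.
#[repr(C)]
#[derive(Clone, Copy, Debug, Default, PartialEq)]
pub struct QuantizeUniformSketch {
    pub quantize_steps: u32,
    /// Unused; present only to pad the struct from 4 to 16 bytes.
    pub _padding: [u32; 3],
}

/// Returns true if `T`'s size is a multiple of 16 bytes.
pub fn is_16_byte_multiple<T>() -> bool {
    std::mem::size_of::<T>() % 16 == 0
}
```

A bare `u32` is 4 bytes and fails the check, while the padded struct is
exactly 16 bytes. Alternative 1 below gets the same size by storing the
value in a `UVec4` instead.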

## Alternatives

### 1- Use a UVec4

Replace the `u32` field in `MyExtension` by a `UVec4` and only use the
`x` coordinate.

(This was the original approach, but for consistency with the rest of
the codebase, separate padding fields seem to be preferred)

### 2- Don't fix it, unlist it

While the fix is quite simple, it does muddy the waters a tiny bit due
to `quantize_steps` now being a UVec4 instead of a simple u32. We could
simply remove this example from the examples that support WebGL2.

## Testing

- Ran the example locally on WebGL2 (and native Vulkan) successfully
2025-07-07 19:34:12 +00:00
Nicky Fahey 831073105f Add comment to custom vertex attribute example to make it easier to convert to 2D (#18603)
# Objective

- It's not clear what changes are needed to the shader to convert the
example to 2D.
- If you leave the shader unchanged you get a very confusing error (see
linked issue).
- Fixes #14077

## Solution

A separate example probably isn't needed as there is little difference
between 3D and 2D, but a note saying what changes are needed to the
shader would make it a lot easier.

Let me know if you think it is also worth adding some notes to the rust
file, but it is mostly trivial changes such as changing `Mesh3d` to
`Mesh2d`. I have left the original code in comments next to the changes
in the gist linked at the bottom if you wish to compare.

## Testing

- I just spent a long time working it out the hard way. This would have
made it a lot quicker.
- I have tested the 2D version of the shader with the changes explained
in the suggested comment and it works as expected.
- For testing purposes [here is a complete working 2D
example](https://gist.github.com/nickyfahey/647e2a2c45e695f24e288432b811dfc2).
(note that as per the original example the shader file needs to go in
'assets/shaders/')
2025-07-07 19:26:37 +00:00
charlotte 🌸 e6ba9a6d18 Type erased materials (#19667)
# Objective

Closes #18075

In order to enable a number of patterns for dynamic materials in the
engine, it's necessary to decouple the renderer from the `Material`
trait.

This opens the possibility for:
- Materials that aren't coupled to `AsBindGroup`.
- 2d using the underlying 3d bindless infrastructure.
- Dynamic materials that can change their layout at runtime.
- Materials that aren't even backed by a Rust struct at all.

## Solution

In short, remove all trait bounds from render world material systems and
resources. This means moving a bunch of stuff onto `MaterialProperties`
and engaging in some hacks to make specialization work. Rather than
storing the bind group data in `MaterialBindGroupAllocator`, right now
we're storing it in a closure on `MaterialProperties`. TBD if this has
bad performance characteristics.

## Benchmarks

- `many_cubes`:
`cargo run --example many_cubes --release --features=bevy/trace_tracy --
--vary-material-data-per-instance`:
![Screenshot 2025-06-26
235426](https://github.com/user-attachments/assets/10a0ee29-9932-4f91-ab43-33518b117ac5)

- @DGriffin91's Caldera
`cargo run --release --features=bevy/trace_tracy -- --random-materials`

![image](https://github.com/user-attachments/assets/ef91ba6a-8e88-4922-a73f-acb0af5b0dbc)


- @DGriffin91's Caldera with 20 unique material types (i.e.
`MaterialPlugin<M>`) and random materials per mesh
`cargo run --release --features=bevy/trace_tracy -- --random-materials`
![Screenshot 2025-06-27
000425](https://github.com/user-attachments/assets/9561388b-881d-46cf-8c3d-b15b3e9aedc7)


### TODO

- We almost certainly lost some parallelization from removing the type
params that could be gained back from smarter iteration.
- Test all the things that could have broken.
- ~Fix meshlets~

## Showcase

See [the
example](https://github.com/bevyengine/bevy/pull/19667/files#diff-9d768cfe1c3aa81eff365d250d3cbe5a63e8df63e81dd85f64c3c3cd993f6d94)
for a custom material implemented without the use of the `Material`
trait and thus `AsBindGroup`.


![image](https://github.com/user-attachments/assets/e3fcca7c-e04e-4a4e-9d89-39d697a9e3b8)

---------

Co-authored-by: IceSentry <IceSentry@users.noreply.github.com>
Co-authored-by: IceSentry <c.giguere42@gmail.com>
2025-06-27 22:57:24 +00:00
charlotte 🌸 96dcbc5f8c Upgrade to wgpu version 25.0 (#19563)
# Objective

Upgrade to `wgpu` version `25.0`.

Depends on https://github.com/bevyengine/naga_oil/pull/121

## Solution

### Problem

The biggest issue we face upgrading is the following requirement:
> To facilitate this change, there was an additional validation rule put
in place: if there is a binding array in a bind group, you may not use
dynamic offset buffers or uniform buffers in that bind group. This
requirement comes from vulkan rules on UpdateAfterBind descriptors.

This is a major difficulty for us, as there are a number of binding
arrays that are used in the view bind group. Note, this requirement does
not affect merely uniform buffers that use dynamic offset but the use of
*any* uniform in a bind group that also has a binding array.

### Attempted fixes

The easiest fix would be to change uniforms to be storage buffers
whenever binding arrays are in use:
```wgsl
#ifdef BINDING_ARRAYS_ARE_USED
@group(0) @binding(0) var<storage> view: array<View>;
@group(0) @binding(1) var<storage> lights: array<types::Lights>;
#else
@group(0) @binding(0) var<uniform> view: View;
@group(0) @binding(1) var<uniform> lights: types::Lights;
#endif
```

This requires passing the view index to the shader so that we know where
to index into the buffer:

```wgsl
struct PushConstants {
    view_index: u32,
}

var<push_constant> push_constants: PushConstants;
```

Using push constants is no problem because binding arrays are only
usable on native anyway.

However, this greatly complicates the ability to access `view` in
shaders. For example:
```wgsl
#ifdef BINDING_ARRAYS_ARE_USED
mesh_view_bindings::view[mesh_view_bindings::view_index].view_from_world[0].z
#else
mesh_view_bindings::view.view_from_world[0].z
#endif
```

Using this approach would work but would have the effect of polluting
our shaders with ifdef spam basically *everywhere*.

Why not use a function? Unfortunately, the following is not valid wgsl
as it returns a binding directly from a function in the uniform path.

```wgsl
fn get_view() -> View {
#if BINDING_ARRAYS_ARE_USED
    let view_index = push_constants.view_index;
    let view = views[view_index];
#endif
    return view;
}
```

This also poses problems for things like lights where we want to return
a ptr to the light data. Returning ptrs from wgsl functions isn't
allowed even if both bindings were buffers.

The next attempt was to simply use indexed buffers everywhere, in both
the binding array and non binding array path. This would be viable if
push constants were available everywhere to pass the view index, but
unfortunately they are not available on webgpu. This means either
passing the view index in a storage buffer (not ideal for such a small
amount of state) or using push constants sometimes and uniform buffers
only on webgpu. However, this kind of conditional layout infects
absolutely everything.

Even if we were to accept just using storage buffer for the view index,
there's also the additional problem that some dynamic offsets aren't
actually per-view but per-use of a setting on a camera, which would
require passing that uniform data on *every* camera regardless of
whether that rendering feature is being used, which is also gross.

As such, although it's gross, the simplest solution is just to bump binding
arrays into `@group(1)` and all other bindings up one bind group. This
should still bring us under the device limit of 4 for most users.

### Next steps / looking towards the future

I'd like to avoid needing to split our view bind group into multiple parts.
In the future, if `wgpu` were to add `@builtin(draw_index)`, we could
build a list of draw state in gpu processing and avoid the need for any
kind of state change at all (see
https://github.com/gfx-rs/wgpu/issues/6823). This would also provide
significantly more flexibility to handle things like offsets into other
arrays that may not be per-view.

### Testing

Tested a number of examples, there are probably more that are still
broken.

---------

Co-authored-by: François Mockers <mockersf@gmail.com>
Co-authored-by: Elabajaba <Elabajaba@users.noreply.github.com>
2025-06-26 19:41:47 +00:00
Mathis Brossier 119eb51f00 Fix game_of_life shader relying on Naga bug (#18951)
# Objective

The game of life example shader relies on a Naga bug
([6397](https://github.com/gfx-rs/wgpu/issues/6397) /
[4536](https://github.com/gfx-rs/wgpu/issues/4536)). In WGSL certain
arithmetic operations must be explicitly parenthesized
([reference](https://www.w3.org/TR/WGSL/#operator-precedence-associativity)).
Naga doesn't enforce that (and also the precedence order is [messed
up](https://github.com/gfx-rs/wgpu/issues/4536#issuecomment-1780113990)).

So this example may break soon. This is the only sample shader having
this issue.

## Solution

added parentheses

## Testing

ran the example before and after the fix with `cargo run --example
compute_shader_game_of_life`
2025-04-26 21:38:08 +00:00
Patrick Walton dc7c8f228f Add bindless support back to ExtendedMaterial. (#18025)
PR #17898 disabled bindless support for `ExtendedMaterial`. This commit
adds it back. It also adds a new example, `extended_material_bindless`,
showing how to use it.
2025-04-09 15:34:44 +00:00
François Mockers 3945a6de3b Fix wesl in wasm and webgl2 (#18591)
# Objective

- feature `shader_format_wesl` doesn't compile in Wasm
- once fixed, example `shader_material_wesl` doesn't work in WebGL2

## Solution

- remove special path handling when loading shaders. this seems like a
way to escape the asset folder which we don't want to allow, and can't
compile on android or wasm, and can't work on iOS (filesystem is rooted
there)
- pad material so that it's 16 bytes. I couldn't get conditional
compilation to work in wesl for type declaration, it fails to parse
- the shader renders the color `(0.0, 0.0, 0.0, 0.0)` when it's not a
polka dot. this renders as black on WebGPU/metal/..., and white on
WebGL2. change it to `(0.0, 0.0, 0.0, 1.0)` so that it's black
everywhere
2025-03-28 21:45:02 +00:00
charlotte 35bf9753e8 Fixes for WESL on Windows (#18373)
# Objective

WESL was broken on Windows.

## Solution

- Upgrade to `wesl_rs` 1.2.
- Fix path handling on Windows.
- Improve example for khronos demo this week.
2025-03-17 22:29:29 +00:00
Benjamin Brienen c3ff6d4136 Fix non-crate typos (#18219)
# Objective

Correct spelling

## Solution

Fix typos, specifically ones that I found in folders other than /crates

## Testing

CI

---------

Co-authored-by: Alice Cecile <alice.i.cecile@gmail.com>
2025-03-11 06:17:48 +00:00
Patrick Walton 913eb46324 Reimplement bindless storage buffers. (#17994)
Support for bindless storage buffers was temporarily removed with the
bindless revamp. This commit restores that support.
2025-03-10 21:32:19 +00:00
charlotte 181445c56b Add support for experimental WESL shader source (#17953)
# Objective

WESL's pre-MVP `0.1.0` has been
[released](https://docs.rs/wesl/latest/wesl/)!

Add support for WESL shader source so that we can begin playing and
testing WESL, as well as aiding in their development.

## Solution

Adds a `ShaderSource::WESL` that can be used to load `.wesl` shaders.

Right now, we don't support mixing `naga-oil`. Additionally, WESL
shaders currently need to pass through the naga frontend, which the WESL
team is aware isn't great for performance (they're working on compiling
into naga modules). Also, since our shaders are managed using the asset
system, we don't currently support using file based imports like `super`
or package scoped imports. Further work will be needed to assess how we
want to support this.

---

## Showcase

See the `shader_material_wesl` example. Be sure to press space to
activate party mode (trigger conditional compilation)!


https://github.com/user-attachments/assets/ec6ad19f-b6e4-4e9d-a00f-6f09336b08a4
2025-03-09 19:26:55 +00:00
Patrick Walton 28441337bb Use global binding arrays for bindless resources. (#17898)
Currently, Bevy's implementation of bindless resources is rather
unusual: every binding in an object that implements `AsBindGroup` (most
commonly, a material) becomes its own separate binding array in the
shader. This is inefficient for two reasons:

1. If multiple materials reference the same texture or other resource,
the reference to that resource will be duplicated many times. This
increases `wgpu` validation overhead.

2. It creates many unused binding array slots. This increases `wgpu` and
driver overhead and makes it easier to hit limits on APIs that `wgpu`
currently imposes tight resource limits on, like Metal.

This PR fixes these issues by switching Bevy to use the standard
approach in GPU-driven renderers, in which resources are de-duplicated
and passed as global arrays, one for each type of resource.

Along the way, this patch introduces per-platform resource limits and
bumps them from 16 resources per binding array to 64 resources per bind
group on Metal and 2048 resources per bind group on other platforms.
(Note that the number of resources per *binding array* isn't the same as
the number of resources per *bind group*; as it currently stands, if all
the PBR features are turned on, Bevy could pack as many as 496 resources
into a single slab.) The limits have been increased because `wgpu` now
has universal support for partially-bound binding arrays, which means
that we no longer need to fill the binding arrays with fallback
resources on Direct3D 12. The `#[bindless(LIMIT)]` declaration when
deriving `AsBindGroup` can now simply be written `#[bindless]` in order
to have Bevy choose a default limit size for the current platform.
Custom limits are still available with the new
`#[bindless(limit(LIMIT))]` syntax: e.g. `#[bindless(limit(8))]`.

The material bind group allocator has been completely rewritten. Now
there are two allocators: one for bindless materials and one for
non-bindless materials. The new non-bindless material allocator simply
maintains a 1:1 mapping from material to bind group. The new bindless
material allocator maintains a list of slabs and allocates materials
into slabs on a first-fit basis. This unfortunately makes its
performance O(number of resources per object * number of slabs), but the
number of slabs is likely to be low, and it's planned to become even
lower in the future with `wgpu` improvements. Resources are
de-duplicated within a slab and reference counted. So, for instance, if
multiple materials refer to the same texture, that texture will exist
only once in the appropriate binding array.
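
The first-fit, de-duplicating allocation described above can be sketched roughly as follows. This is a standalone illustration, not Bevy's actual allocator: `ResourceId`, `Slab`, and `BindlessAllocator` are hypothetical names, resources are plain integers, and it assumes a single material never needs more distinct resources than one slab can hold.

```rust
use std::collections::{HashMap, HashSet};

/// Hypothetical resource identifier (stands in for a texture or buffer handle).
type ResourceId = u32;

/// One slab: a fixed-capacity array of de-duplicated, reference-counted slots.
struct Slab {
    capacity: usize,
    /// resource -> (slot index, reference count)
    slots: HashMap<ResourceId, (usize, u32)>,
    /// Slot indices freed earlier and available for reuse.
    free_slots: Vec<usize>,
    /// Next slot index that has never been handed out.
    next_slot: usize,
}

impl Slab {
    fn new(capacity: usize) -> Self {
        Slab { capacity, slots: HashMap::new(), free_slots: Vec::new(), next_slot: 0 }
    }

    /// True if every resource is either already present or fits in a spare slot.
    fn can_fit(&self, resources: &[ResourceId]) -> bool {
        let new_resources: HashSet<ResourceId> = resources
            .iter()
            .copied()
            .filter(|resource| !self.slots.contains_key(resource))
            .collect();
        self.slots.len() + new_resources.len() <= self.capacity
    }

    /// Adds one reference per resource and returns the slot index of each.
    fn insert(&mut self, resources: &[ResourceId]) -> Vec<usize> {
        let mut indices = Vec::with_capacity(resources.len());
        for &resource in resources {
            if let Some((slot, count)) = self.slots.get_mut(&resource) {
                // Already in this slab: share the slot and bump the count.
                *count += 1;
                indices.push(*slot);
            } else {
                let slot = match self.free_slots.pop() {
                    Some(slot) => slot,
                    None => {
                        let slot = self.next_slot;
                        self.next_slot += 1;
                        slot
                    }
                };
                self.slots.insert(resource, (slot, 1));
                indices.push(slot);
            }
        }
        indices
    }

    /// Drops one reference per resource, freeing slots whose count reaches zero.
    fn release(&mut self, resources: &[ResourceId]) {
        for resource in resources {
            if let Some((slot, count)) = self.slots.get_mut(resource) {
                *count -= 1;
                if *count == 0 {
                    let freed = *slot;
                    self.slots.remove(resource);
                    self.free_slots.push(freed);
                }
            }
        }
    }
}

/// Hypothetical allocator: a list of slabs searched on a first-fit basis.
struct BindlessAllocator {
    slab_capacity: usize,
    slabs: Vec<Slab>,
}

impl BindlessAllocator {
    fn new(slab_capacity: usize) -> Self {
        BindlessAllocator { slab_capacity, slabs: Vec::new() }
    }

    /// Returns (slab index, slot index of each resource in that slab).
    fn allocate(&mut self, resources: &[ResourceId]) -> (usize, Vec<usize>) {
        let slab_index = match self.slabs.iter().position(|slab| slab.can_fit(resources)) {
            Some(index) => index,
            None => {
                self.slabs.push(Slab::new(self.slab_capacity));
                self.slabs.len() - 1
            }
        };
        (slab_index, self.slabs[slab_index].insert(resources))
    }

    fn release(&mut self, slab_index: usize, resources: &[ResourceId]) {
        self.slabs[slab_index].release(resources);
    }
}
```

The `position` scan is where the O(number of resources per object * number of slabs) cost comes from: each candidate slab is probed once per resource in the material.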

To support these new features, this patch adds the concept of a
*bindless descriptor* to the `AsBindGroup` trait. The bindless
descriptor allows the material bind group allocator to probe the layout
of the material, now that an array of `BindGroupLayoutEntry` records is
insufficient to describe the group. The `#[derive(AsBindGroup)]` has
been heavily modified to support the new features. The most important
user-facing change to that macro is that the struct-level `uniform`
attribute, `#[uniform(BINDING_NUMBER, StandardMaterial)]`, now reads
`#[uniform(BINDLESS_INDEX, MATERIAL_UNIFORM_TYPE,
binding_array(BINDING_NUMBER))]`, allowing the material to specify the
binding number for the binding array that holds the uniform data.

To make this patch simpler, I removed support for bindless
`ExtendedMaterial`s, as well as field-level bindless uniform and storage
buffers. I intend to add back support for these as a follow-up. Because
they aren't in any released Bevy version yet, I figured this was OK.

Finally, this patch updates `StandardMaterial` for the new bindless
changes. Generally, code throughout the PBR shaders that looked like
`base_color_texture[slot]` now looks like
`bindless_2d_textures[material_indices[slot].base_color_texture]`.
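
The extra level of indirection can be modelled on the CPU as below. This is only a sketch of the lookup pattern: the `MaterialIndices` struct here is a hypothetical one-field stand-in, and textures are represented by names rather than GPU resources.

```rust
/// Hypothetical per-material index table; the real table has more fields.
struct MaterialIndices {
    base_color_texture: u32,
}

/// Resolves a material slot to its base color texture through the global array.
fn base_color_texture<'a>(
    bindless_2d_textures: &[&'a str],
    material_indices: &[MaterialIndices],
    slot: usize,
) -> &'a str {
    // First hop: material slot -> index into the global texture array.
    let texture_index = material_indices[slot].base_color_texture as usize;
    // Second hop: global texture array -> the texture itself.
    bindless_2d_textures[texture_index]
}
```

Because materials store only indices, two materials that share a texture point at the same entry in the global array instead of each holding their own copy of the binding.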

This patch fixes a system hang that I experienced on the [Caldera test]
when running with `caldera --random-materials --texture-count 100`. The
time per frame is around 19.75 ms, down from 154.2 ms in Bevy 0.14: a
7.8× speedup.

[Caldera test]: https://github.com/DGriffin91/bevy_caldera_scene
2025-02-21 05:55:36 +00:00
ickshonpe 02985c3d56 ui_material example webgl2 fix (#17852)
# Objective

Fixes #17851

## Solution

Align the `slider` uniform to 16 bytes by making it a `vec4`.
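
As a rough CPU-side illustration of why widening to a `vec4` helps, the sketch below compares the byte sizes of a lone `f32` and four `f32`s. The struct names are hypothetical and are not the types used in the example; the only claim is about their sizes.

```rust
use std::mem::size_of;

/// Hypothetical mirror of a uniform holding only the slider value: 4 bytes.
#[repr(C)]
#[derive(Clone, Copy)]
struct SliderUnpadded {
    value: f32,
}

/// Hypothetical mirror of the same uniform widened to a `vec4<f32>`: 16 bytes.
#[repr(C)]
#[derive(Clone, Copy)]
struct SliderPadded {
    /// Only the first component carries data; the rest is padding.
    value: [f32; 4],
}

/// Packs a slider value into the padded layout.
fn padded_slider(value: f32) -> SliderPadded {
    SliderPadded { value: [value, 0.0, 0.0, 0.0] }
}

/// Byte sizes of the two layouts: (unpadded, padded).
fn layout_sizes() -> (usize, usize) {
    (size_of::<SliderUnpadded>(), size_of::<SliderPadded>())
}
```

On the shader side the slider value would then be read from the first component of the `vec4`.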

## Testing

Run the example using:
```
cargo run -p build-wasm-example -- --api webgl2 ui_material
basic-http-server examples/wasm/
```
2025-02-13 20:52:26 +00:00
charlotte a861452d68 Add user supplied mesh tag (#17648)
# Objective

Because of mesh preprocessing, users cannot rely on
`@builtin(instance_index)` in order to reference external data, as the
instance index is not stable, either from frame to frame or relative to
the total spawn order of mesh instances.

## Solution

Add a user supplied mesh index that can be used for referencing external
data when drawing instanced meshes.

Closes #13373

## Testing

Benchmarked `many_cubes` showing no difference in total frame time.

## Showcase



https://github.com/user-attachments/assets/80620147-aafc-4d9d-a8ee-e2149f7c8f3b

---------

Co-authored-by: IceSentry <IceSentry@users.noreply.github.com>
2025-02-10 22:38:13 +00:00
IceSentry 4ecbe001d5 Add a custom render phase example (#16916)
# Objective

- It's currently very hard for beginners and advanced users to get a
full understanding of a complete render phase.

## Solution

- Implement a full custom render phase
- The render phase in the example is intended to show a custom stencil
phase that renders the stencil in red directly on the screen

---

## Showcase

<img width="1277" alt="image"
src="https://github.com/user-attachments/assets/e9dc0105-4fb6-463f-ad53-0529b575fd28"
/>

## Notes

More docs to explain what is going on are still needed but the example
works and can already help some people.

We might want to consider using a batched phase and cold specialization
in the future, but the example is already complex enough as it is.

---------

Co-authored-by: Christopher Biscardi <chris@christopherbiscardi.com>
2025-02-10 21:17:37 +00:00
ickshonpe c0ccc87738 UI material border radius (#15171)
# Objective

I wrote a box shadow UI material naively thinking I could use the border
widths attribute to hold the border radius but it
doesn't work as the border widths are automatically set in the
extraction function. Need to send border radius to the shader separately
for it to be viable.

## Solution

Add a `border_radius` vertex attribute to the ui material.

This PR also removes the normalization of border widths for custom UI
materials. The regular UI shader doesn't do this so it's a bit confusing
and means you can't use the logic from `ui.wgsl` in your custom UI
materials.

## Testing / Showcase

Made a change to the `ui_material` example to display border radius:

```cargo run --example ui_material```

<img width="569" alt="corners" src="https://github.com/user-attachments/assets/36412736-a9ee-4042-aadd-68b9cafb17cb" />
2025-01-28 04:54:48 +00:00
Patrick Walton fc831c390d Implement basic clustered decal projectors. (#17315)
This commit adds support for *decal projectors* to Bevy, allowing for
textures to be projected on top of geometry. Decal projectors are
clusterable objects, just as punctual lights and light probes are. This
means that decals are only evaluated for objects within the conservative
bounds of the projector, and they don't require a second pass.

These clustered decals require support for bindless textures and as such
currently don't work on WebGL 2, WebGPU, macOS, or iOS. For an
alternative that doesn't require bindless, see PR #16600. I believe that
both contact projective decals in #16600 and clustered decals are
desirable to have in Bevy. Contact projective decals offer broader
hardware and driver support, while clustered decals don't require the
creation of bounding geometry.

A new example, `decal_projectors`, has been added, which demonstrates
multiple decals on a rotating object. The decal projectors can be scaled
and rotated with the mouse.

There are several limitations of this initial patch that can be
addressed in follow-ups:

1. There's no way to specify the Z-index of decals. That is, the order
in which multiple decals are blended on top of one another is arbitrary.
A follow-up could introduce some sort of Z-index field so that artists
can specify that some decals should be blended on top of others.

2. Decals don't take the normal of the surface they're projected onto
into account. Most decal implementations in other engines have a feature
whereby the angle between the decal projector and the normal of the
surface must be within some threshold for the decal to appear. Often,
artists can specify a fade-off range for a smooth transition between
oblique surfaces and aligned surfaces.

3. There's no distance-based fadeoff toward the end of the projector
range. Many decal implementations have this.

This addresses #2401.
 
## Showcase

![Screenshot 2025-01-11
052913](https://github.com/user-attachments/assets/8fabbafc-60fb-461d-b715-d7977e10fe1f)
2025-01-26 20:13:39 +00:00
ickshonpe 51c3bf24b7 custom_ui_material border fix (#17282)
# Objective

The order of the border edges in `UiVertexOutput` is left, right, top,
bottom but in `custom_ui_material` the selectors switch them so left is
right and top is bottom.

## Solution

Reverse the conditions so that the correct border values are selected.
2025-01-11 05:45:20 +00:00
Patrick Walton a8f15bd95e Introduce two-level bins for multidrawable meshes. (#16898)
Currently, our batchable binned items are stored in a hash table that
maps bin key, which includes the batch set key, to a list of entities.
Multidraw is handled by sorting the bin keys and accumulating adjacent
bins that can be multidrawn together (i.e. have the same batch set key)
into multidraw commands during `batch_and_prepare_binned_render_phase`.

This is reasonably efficient right now, but it will complicate future
work to retain indirect draw parameters from frame to frame. Consider
what must happen when we have retained indirect draw parameters and the
application adds a bin (i.e. a new mesh) that shares a batch set key
with some pre-existing meshes. (That is, the new mesh can be multidrawn
with the pre-existing meshes.) To be maximally efficient, our goal in
that scenario will be to update *only* the indirect draw parameters for
the batch set (i.e. multidraw command) containing the mesh that was
added, while leaving the others alone. That means that we have to
quickly locate all the bins that belong to the batch set being modified.

In the existing code, we would have to sort the list of bin keys so that
bins that can be multidrawn together become adjacent to one another in
the list. Then we would have to do a binary search through the sorted
list to find the location of the bin that was just added. Next, we would
have to widen our search to adjacent indexes that contain the same batch
set, doing expensive comparisons against the batch set key every time.
Finally, we would reallocate the indirect draw parameters and update the
stored pointers to the indirect draw parameters that the bins store.

By contrast, it'd be dramatically simpler if we simply changed the way
bins are stored to first map from batch set key (i.e. multidraw command)
to the bins (i.e. meshes) within that batch set key, and then from each
individual bin to the mesh instances. That way, the scenario above in
which we add a new mesh will be simpler to handle. First, we will look
up the batch set key corresponding to that mesh in the outer map to find
an inner map corresponding to the single multidraw command that will
draw that batch set. We will know how many meshes the multidraw command
is going to draw by the size of that inner map. Then we simply need to
reallocate the indirect draw parameters and update the pointers to those
parameters within the bins as necessary. There will be no need to do any
binary search or expensive batch set key comparison: only a single hash
lookup and an iteration over the inner map to update the pointers.
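
A minimal sketch of the two-level layout, assuming plain integer keys: `BatchSetKey`, `BinKey`, `Entity`, and `TwoLevelBins` are hypothetical stand-ins here, not the types the patch introduces.

```rust
use std::collections::HashMap;

// Hypothetical stand-ins for the real key and entity types.
type BatchSetKey = u32;
type BinKey = u32;
type Entity = u64;

/// Outer map: batch set key (one multidraw command) -> inner map.
/// Inner map: bin key (one mesh) -> the mesh instances in that bin.
#[derive(Default)]
struct TwoLevelBins {
    batch_sets: HashMap<BatchSetKey, HashMap<BinKey, Vec<Entity>>>,
}

impl TwoLevelBins {
    /// Adds an entity and returns how many bins (meshes) its batch set now
    /// contains, i.e. how many meshes the multidraw command will draw.
    fn add(&mut self, batch_set_key: BatchSetKey, bin_key: BinKey, entity: Entity) -> usize {
        // A single hash lookup finds every bin that shares the multidraw command.
        let bins = self.batch_sets.entry(batch_set_key).or_default();
        bins.entry(bin_key).or_default().push(entity);
        bins.len()
    }

    /// Bin keys of one batch set: the only bins whose indirect draw
    /// parameter pointers would need updating after a change.
    fn bins_in_batch_set(&self, batch_set_key: BatchSetKey) -> Vec<BinKey> {
        let mut keys: Vec<BinKey> = self
            .batch_sets
            .get(&batch_set_key)
            .map(|bins| bins.keys().copied().collect())
            .unwrap_or_default();
        keys.sort_unstable();
        keys
    }
}
```

Adding a mesh to batch set `1` touches only the inner map for `1`; other batch sets are never visited, which is the property the retained indirect draw parameters would rely on.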

This patch implements the above technique. Because we don't have
retained bins yet, this PR provides no performance benefits. However, it
opens the door to maximally efficient updates when only a small number
of meshes change from frame to frame.

The main churn that this patch causes is that the *batch set key* (which
uniquely specifies a multidraw command) and *bin key* (which uniquely
specifies a mesh *within* that multidraw command) are now separate,
instead of the batch set key being embedded *within* the bin key.

In order to isolate potential regressions, I think that at least #16890,
#16836, and #16825 should land before this PR does.

## Migration Guide

* The *batch set key* is now separate from the *bin key* in
`BinnedPhaseItem`. The batch set key is used to collect multidrawable
meshes together. If you aren't using the multidraw feature, you can
safely set the batch set key to `()`.
2025-01-06 18:34:40 +00:00
Rob Parrett 651b22f31f Update typos (#17126)
# Objective

Use the latest version of `typos` and fix the typos that it now detects

# Additional Info

By the way, `typos` has a "low priority typo suggestions issue" where we
can throw typos we find that `typos` doesn't catch.

(This link may go stale) https://github.com/crate-ci/typos/issues/1200
2025-01-03 17:44:26 +00:00
kurk070ff 3cd649b805 Fix inaccurate comment in custom_ui_material.wgsl shader (#16846)
# Objective

- Modify a comment in the shader file to describe what the shader
actually does
- Fixes #16830

## Solution

- Changed the comment.

## Testing

- Testing is not relevant to fixing comments (as long as the comment is
accurate)

---------

Co-authored-by: Freya Pines <freya@MacBookAir.lan>
Co-authored-by: Freya Pines <freya@Freyas-MacBook-Air.local>
2024-12-17 00:09:36 +00:00
Patrick Walton 35826be6f7 Implement bindless lightmaps. (#16653)
This commit allows Bevy to bind 16 lightmaps at a time, if the current
platform supports bindless textures. Naturally, if bindless textures
aren't supported, Bevy falls back to binding only a single lightmap at a
time. As lightmaps are usually heavily atlased, I doubt many scenes will
use more than 16 lightmap textures.

This has little performance impact now, but it's desirable for us to
reap the benefits of multidraw and bindless textures on scenes that use
lightmaps. Otherwise, we might have to break batches in order to switch
those lightmaps.

Additionally, this PR slightly reduces the cost of binning because it
makes the lightmap index in `Opaque3dBinKey` 32 bits instead of an
`AssetId`.

## Migration Guide

* The `Opaque3dBinKey::lightmap_image` field is now
`Opaque3dBinKey::lightmap_slab`, which is a lightweight identifier for
an entire binding array of lightmaps.
2024-12-16 23:37:06 +00:00
Patrick Walton b7bcd313ca Cluster light probes using conservative spherical bounds. (#13746)
This commit allows the Bevy renderer to use the clustering
infrastructure for light probes (reflection probes and irradiance
volumes) on platforms where at least 3 storage buffers are available. On
such platforms (the vast majority), we stop performing brute-force
searches of light probes for each fragment and instead only search the
light probes with bounding spheres that intersect the current cluster.
This should dramatically improve scalability of irradiance volumes and
reflection probes.
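
The conservative test can be sketched as a sphere-versus-box check. This is a simplified illustration: it assumes each cluster is approximated by an axis-aligned box, which may not match how Bevy represents clusters, and the function names are hypothetical.

```rust
/// Returns true if a sphere touches or overlaps an axis-aligned box.
fn sphere_intersects_aabb(
    center: [f32; 3],
    radius: f32,
    aabb_min: [f32; 3],
    aabb_max: [f32; 3],
) -> bool {
    let mut distance_squared = 0.0_f32;
    for axis in 0..3 {
        // Closest point on the box to the sphere center, per axis.
        let closest = center[axis].clamp(aabb_min[axis], aabb_max[axis]);
        let delta = center[axis] - closest;
        distance_squared += delta * delta;
    }
    distance_squared <= radius * radius
}

/// Indices of the probes (center, bounding radius) that a cluster must consider.
fn probes_for_cluster(
    probes: &[([f32; 3], f32)],
    cluster_min: [f32; 3],
    cluster_max: [f32; 3],
) -> Vec<usize> {
    probes
        .iter()
        .enumerate()
        .filter(|(_, (center, radius))| {
            sphere_intersects_aabb(*center, *radius, cluster_min, cluster_max)
        })
        .map(|(index, _)| index)
        .collect()
}
```

A fragment then only iterates over the indices assigned to its cluster. The test is conservative: a bounding sphere can overlap a cluster that the probe's actual box does not reach, which costs some extra work but never drops a probe.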

The primary platform that doesn't support 3 storage buffers is WebGL 2,
and we continue using a brute-force search of light probes on that
platform, as the UBO that stores per-cluster indices is too small to fit
the light probe counts. Note, however, that that platform also doesn't
support bindless textures (indeed, it would be very odd for a platform
to support bindless textures but not SSBOs), so we only support one of
each type of light probe per drawcall there in the first place.
Consequently, this isn't a performance problem, as the search will only
have one light probe to consider. (In fact, clustering would probably
end up being a performance loss.)

Known potential improvements include:

1. We currently cull based on a conservative bounding sphere test and
not based on the oriented bounding box (OBB) of the light probe. This is
improvable, but in the interests of simplicity, I opted to keep the
bounding sphere test for now. The OBB improvement can be a follow-up.

2. This patch doesn't change the fact that each fragment only takes a
single light probe into account. Typical light probe implementations
detect the case in which multiple light probes cover the current
fragment and perform some sort of weighted blend between them. As the
light probe fetch function presently returns only a single light probe,
implementing that feature would require more code restructuring, so I
left it out for now. It can be added as a follow-up.

3. Light probe implementations typically have a falloff range. Although
this is a wanted feature in Bevy, this particular commit also doesn't
implement that feature, as it's out of scope.

4. This commit doesn't raise the maximum number of light probes past its
current value of 8 for each type. This should be addressed later, but
would possibly require more bindings on platforms with storage buffers,
which would increase this patch's complexity. Even without raising the
limit, this patch should constitute a significant performance
improvement for scenes that get anywhere close to this limit. In the
interest of keeping this patch small, I opted to leave raising the limit
to a follow-up.

## Changelog

### Changed

* Light probes (reflection probes and irradiance volumes) are now
clustered on most platforms, improving performance when many light
probes are present.

---------

Co-authored-by: Benjamin Brienen <Benjamin.Brienen@outlook.com>
Co-authored-by: Alice Cecile <alice.i.cecile@gmail.com>
2024-12-05 13:07:10 +00:00
Patrick Walton 5adf831b42 Add a bindless mode to AsBindGroup. (#16368)
This patch adds the infrastructure necessary for Bevy to support
*bindless resources*, by adding a new `#[bindless]` attribute to
`AsBindGroup`.

Classically, only a single texture (or sampler, or buffer) can be
attached to each shader binding. This means that switching materials
requires breaking a batch and issuing a new drawcall, even if the mesh
is otherwise identical. This adds significant overhead not only in the
driver but also in `wgpu`, as switching bind groups increases the amount
of validation work that `wgpu` must do.

*Bindless resources* are the typical solution to this problem. Instead
of switching bindings between each texture, the renderer instead
supplies a large *array* of all textures in the scene up front, and the
material contains an index into that array. This pattern is repeated for
buffers and samplers as well. The renderer now no longer needs to switch
binding descriptor sets while drawing the scene.

Unfortunately, as things currently stand, this approach won't quite work
for Bevy. Two aspects of `wgpu` conspire to make this ideal approach
unacceptably slow:

1. In the DX12 backend, all binding arrays (bindless resources) must
have a constant size declared in the shader, and all textures in an
array must be bound to actual textures. Changing the size requires a
recompile.

2. Changing even one texture incurs revalidation of all textures, a
process that takes time that's linear in the total size of the binding
array.

This means that declaring a large array of textures big enough to
encompass the entire scene is presently unacceptably slow. For example,
if you declare 4096 textures, then `wgpu` will have to revalidate all
4096 textures if even a single one changes. This process can take
multiple frames.

To work around this problem, this PR groups bindless resources into
small *slabs* and maintains a free list for each. The size of each slab
for the bindless arrays associated with a material is specified via the
`#[bindless(N)]` attribute. For instance, consider the following
declaration:

```rust
#[derive(AsBindGroup)]
#[bindless(16)]
struct MyMaterial {
    #[buffer(0)]
    color: Vec4,
    #[texture(1)]
    #[sampler(2)]
    diffuse: Handle<Image>,
}
```

The `#[bindless(N)]` attribute specifies that, if bindless arrays are
supported on the current platform, each resource becomes a binding array
of N instances of that resource. So, for `MyMaterial` above, the `color`
attribute is exposed to the shader as `binding_array<vec4<f32>, 16>`,
the `diffuse` texture is exposed to the shader as
`binding_array<texture_2d<f32>, 16>`, and the `diffuse` sampler is
exposed to the shader as `binding_array<sampler, 16>`. Inside the
material's vertex and fragment shaders, the applicable index is
available via the `material_bind_group_slot` field of the `Mesh`
structure. So, for instance, you can access the current color like so:

```wgsl
// `uniform` binding arrays are a non-sequitur, so `uniform` is automatically promoted
// to `storage` in bindless mode.
@group(2) @binding(0) var<storage> material_color: binding_array<Color, 4>;
...
@fragment
fn fragment(in: VertexOutput) -> @location(0) vec4<f32> {
    let color = material_color[mesh[in.instance_index].material_bind_group_slot];
    ...
}
```

Note that portable shader code can't guarantee that the current platform
supports bindless textures. Indeed, bindless mode is only available in
Vulkan and DX12. The `BINDLESS` shader definition is available for your
use to determine whether you're on a bindless platform or not. Thus a
portable version of the shader above would look like:

```wgsl
#ifdef BINDLESS
@group(2) @binding(0) var<storage> material_color: binding_array<Color, 4>;
#else // BINDLESS
@group(2) @binding(0) var<uniform> material_color: Color;
#endif // BINDLESS
...
@fragment
fn fragment(in: VertexOutput) -> @location(0) vec4<f32> {
#ifdef BINDLESS
    let color = material_color[mesh[in.instance_index].material_bind_group_slot];
#else // BINDLESS
    let color = material_color;
#endif // BINDLESS
    ...
}
```

Importantly, this PR *doesn't* update `StandardMaterial` to be bindless.
So, for example, `scene_viewer` will currently not run any faster. I
intend to update `StandardMaterial` to use bindless mode in a follow-up
patch.

A new example, `shaders/shader_material_bindless`, has been added to
demonstrate how to use this new feature.

Here's a Tracy profile of `submit_graph_commands` of this patch and an
additional patch (not submitted yet) that makes `StandardMaterial` use
bindless. Red is those patches; yellow is `main`. The scene was Bistro
Exterior with a hack that forces all textures to opaque. You can see a
1.47x mean speedup.
![Screenshot 2024-11-12
161713](https://github.com/user-attachments/assets/4334b362-42c8-4d64-9cfb-6835f019b95c)

## Migration Guide

* `RenderAssets::prepare_asset` now takes an `AssetId` parameter.
* Bin keys now have Bevy-specific material bind group indices instead of
`wgpu` material bind group IDs, as part of the bindless change. Use the
new `MaterialBindGroupAllocator` to map from bind group index to bind
group ID.
2024-12-03 18:00:34 +00:00
Jake Swenson 16b39c2b36 examples(shaders/glsl): Update GLSL Shader Example Camera View uniform (#15865)
# Objective
The Custom Material GLSL shader example has an old version of the camera
view uniform structure.
This PR updates the example GLSL custom material shader to have the
latest structure.


## Solution

I was running into issues using the camera world position (it wasn't
changing) and someone in discord pointed me to the source of truth.
  `crates/bevy_render/src/view/view.wgsl`

After using this latest uniform structure in my project I'm now able to
work with the camera position in my shader.

## Testing
I tested this change by running the example with:
```bash
cargo run --features shader_format_glsl --example shader_material_glsl
```
<img width="1392" alt="image"
src="https://github.com/user-attachments/assets/39fc82ec-ff3b-4864-ad73-05f3a25db483">

---------

Co-authored-by: Carter Anderson <mcanders1@gmail.com>
2024-10-19 01:08:55 +00:00
charlotte 40c26f80aa Gpu readback (#15419)
# Objective

Adds a new `Readback` component to request readback of a
`Handle<Image>` or `Handle<ShaderStorageBuffer>` to the CPU in a future
frame.

## Solution

We track the `Readback` component, allocate a target buffer to write
the GPU resource into, and map it back asynchronously, which then fires a
trigger on the entity in the main world. This process is asynchronous
and generally takes a few frames.

## Showcase

```rust
let mut buffer = ShaderStorageBuffer::from(vec![0u32; 16]);
buffer.buffer_description.usage |= BufferUsages::COPY_SRC;
let buffer = buffers.add(buffer);

commands
    .spawn(Readback::buffer(buffer.clone()))
    .observe(|trigger: Trigger<ReadbackComplete>| {
        info!("Buffer data from previous frame {:?}", trigger.event());
    });
```

---------

Co-authored-by: Kristoffer Søholm <k.soeholm@gmail.com>
Co-authored-by: IceSentry <IceSentry@users.noreply.github.com>
2024-09-30 17:28:55 +00:00
ickshonpe 09d2292016 Add a border to the UI material example (#15120)
# Objective

There aren't any examples of how to draw a ui material with borders.

## Solution

Add border rendering to the `ui_material` example's shader.

## Showcase

<img width="395" alt="bordermat"
src="https://github.com/user-attachments/assets/109c59c1-f54b-4542-96f7-acff63f5057f">

---------

Co-authored-by: charlotte <charlotte.c.mcelwain@gmail.com>
2024-09-09 16:34:24 +00:00
charlotte a4640046fc Adds ShaderStorageBuffer asset (#14663)
Adds a new `Handle<Storage>` asset type that can be used as a render
asset, particularly for use with `AsBindGroup`.

Closes: #13658 

# Objective

Allow users to create storage buffers in the main world without having
to access the `RenderDevice`. While this resource is technically
available, it's bad form to use in the main world and requires mixing
rendering details with main world code. Additionally, this makes storage
buffers easier to use with `AsBindGroup`, particularly in the following
scenarios:
- Sharing the same buffers between a compute stage and material shader.
We already have examples of this for storage textures (see game of life
example) and these changes allow a similar pattern to be used with
storage buffers.
- Preventing repeated GPU upload (see the previous, easier-to-use `Vec`
`AsBindGroup` option).
- Allow initializing custom materials using `Default`. Previously, the
lack of a `Default` implementation for the raw `wgpu::Buffer` type made
implementing an `AsBindGroup + Default` bound difficult in the presence
of buffers.

## Solution

Adds a new `Handle<Storage>` asset type that is prepared into a
`GpuStorageBuffer` render asset. This asset can either be initialized
with a `Vec<u8>` of properly aligned data or with a size hint. Users can
modify the underlying `wgpu::BufferDescriptor` to provide additional
usage flags.
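
As a rough illustration of the `Vec<u8>` input mentioned above, the sketch below packs a slice of `u32` values into raw bytes using only the standard library. The helper name `pack_u32s` is hypothetical and not part of Bevy, and it only covers the simple case of a tightly packed array of `u32`; structs with mixed field types need layout-aware packing that this sketch does not attempt.

```rust
/// Hypothetical helper (not a Bevy API): flattens `u32` values into the raw
/// bytes that a storage buffer could be initialized with.
fn pack_u32s(values: &[u32]) -> Vec<u8> {
    // Each `u32` contributes exactly 4 bytes in native byte order, so the
    // result is a tightly packed array whose length is `4 * values.len()`.
    values.iter().flat_map(|v| v.to_ne_bytes()).collect()
}
```

For example, `pack_u32s(&[0u32; 16])` yields 64 bytes, the same amount of data as the 16-element buffer used in the readback showcase.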

## Migration Guide

The `AsBindGroup` `storage` attribute has been modified to reference the
new `Handle<Storage>` asset instead. Usages of `Vec` should be converted
into assets instead.

---------

Co-authored-by: IceSentry <IceSentry@users.noreply.github.com>
2024-09-02 16:46:34 +00:00
IceSentry bfcb19a871 Add example showing how to use SpecializedMeshPipeline (#14370)
# Objective

- A lot of mid-level rendering APIs are hard to figure out because they
don't have any examples.
- `SpecializedMeshPipeline` can be really useful in some cases when you
want more flexibility than a `Material` without having to go to low-level
APIs.

## Solution

- Add an example showing how to make a custom `SpecializedMeshPipeline`.

## Testing

- Did you test these changes? If so, how?
- Are there any parts that need more testing?
- How can other people (reviewers) test your changes? Is there anything
specific they need to know?
- If relevant, what platforms did you test these changes on, and are
there any important ones you can't test?

---

## Showcase

The example just spawns 3 triangles in a triangle pattern.


![image](https://github.com/user-attachments/assets/c3098758-94c4-4775-95e5-1d7c7fb9eb86)

---------

Co-authored-by: Alice Cecile <alice.i.cecile@gmail.com>
2024-07-31 18:24:58 +00:00
IceSentry 011f71a245 Update ui_material example to be a slider instead (#14031)
# Objective

- Some people have asked how to do image masking in UI. It's pretty easy
to do using a `UiMaterial` assuming you know how to write shaders.

## Solution

- Update the ui_material example to show the bevy banner slowly being
revealed like a progress bar
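
The reveal effect described above boils down to a per-pixel mask. The sketch below expresses that idea as a plain Rust function rather than shader code; `masked_alpha` and its parameters are hypothetical names, and this is one common way to build such a mask, not necessarily what the example's shader does.

```rust
/// Hypothetical per-pixel mask for a left-to-right reveal.
///
/// `uv_x` is the pixel's horizontal texture coordinate and `progress` is the
/// slider value; both are assumed to lie in `0.0..=1.0`.
fn masked_alpha(uv_x: f32, progress: f32, base_alpha: f32) -> f32 {
    // Pixels to the left of the progress threshold keep their alpha;
    // everything to the right is made fully transparent.
    if uv_x < progress {
        base_alpha
    } else {
        0.0
    }
}
```

Animating `progress` from 0.0 to 1.0 over time would reveal the image from left to right.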

## Notes

I'm not entirely sure if we want this or not. People who would be
comfortable using this in their own games have probably already figured
out how to do it, and for people who aren't familiar with shaders this
isn't really enough to make an actual slider/progress bar.

---------

Co-authored-by: François Mockers <francois.mockers@vleue.com>
2024-06-27 21:23:04 +00:00
Patrick Walton 44db8b7fac Allow phase items not associated with meshes to be binned. (#14029)
As reported in #14004, many third-party plugins, such as Hanabi, enqueue
entities that don't have meshes into render phases. However, the
introduction of indirect mode added a dependency on mesh-specific data,
breaking this workflow. This is because GPU preprocessing requires that
the render phases manage indirect draw parameters, which don't apply to
objects that aren't meshes. The existing code skips over binned entities
that don't have indirect draw parameters, which causes the rendering to
be skipped for such objects.

To support this workflow, this commit adds a new field,
`non_mesh_items`, to `BinnedRenderPhase`. This field contains a simple
list of (bin key, entity) pairs. After drawing batchable and unbatchable
objects, the non-mesh items are drawn one after another. Bevy itself
doesn't enqueue any items into this list; it exists solely for the
application and/or plugins to use.

Additionally, this commit switches the asset ID in the standard bin keys
to be an untyped asset ID rather than that of a mesh. This allows more
flexibility, allowing bins to be keyed off any type of asset.

This patch adds a new example, `custom_phase_item`, which simultaneously
serves to demonstrate how to use this new feature and to act as a
regression test so this doesn't break again.
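
The draw order described above can be sketched with a small standalone model. Everything here is a simplified assumption for illustration: `BinnedPhase`, the `u32` bin keys, and the `Entity` alias are hypothetical stand-ins, not Bevy's actual `BinnedRenderPhase` types.

```rust
use std::collections::BTreeMap;

/// Hypothetical stand-in for an entity ID; not Bevy's `Entity`.
type Entity = u32;

/// Simplified, hypothetical model of a binned render phase.
#[derive(Default)]
struct BinnedPhase {
    /// Bin key -> entities that can be drawn together as a batch.
    batchable: BTreeMap<u32, Vec<Entity>>,
    /// Bin key -> entities that each need their own draw call.
    unbatchable: BTreeMap<u32, Vec<Entity>>,
    /// Plain list of (bin key, entity) pairs with no mesh-specific data.
    non_mesh_items: Vec<(u32, Entity)>,
}

impl BinnedPhase {
    /// Returns the entities in the order they would be drawn: batchable
    /// first, then unbatchable, then non-mesh items one after another.
    fn draw_order(&self) -> Vec<Entity> {
        let mut order = Vec::new();
        order.extend(self.batchable.values().flatten().copied());
        order.extend(self.unbatchable.values().flatten().copied());
        order.extend(self.non_mesh_items.iter().map(|&(_, entity)| entity));
        order
    }
}
```

In this model, a plugin would push its own `(bin key, entity)` pairs onto `non_mesh_items`, and those entries are drawn last regardless of the order in which they were queued relative to mesh entities.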

Fixes #14004.

## Changelog

### Added

* `BinnedRenderPhase` now contains a `non_mesh_items` field for plugins
to add custom items to.
2024-06-27 16:13:03 +00:00
JMS55 c50a4d8821 Remove unused mip_bias parameter from apply_normal_mapping (#13752)
Mip bias is no longer used here
2024-06-10 13:00:34 +00:00