concepts.vision.fm_match.diff3f.extractor_diff3f.get_features_per_vertex

get_features_per_vertex(mesh, device=None, diffusion_pipeline=None, dino_model=None, *, mesh_vertices=None, prompt=None, prompts_list=None, num_views=100, H=512, W=512, use_latent=False, use_normal_map=True, use_ball_query=True, ball_query_radius_factor=0.01, num_images_per_prompt=1, return_image=True, verbose=False)

Extract features per vertex from a mesh using a diffusion model and a DINO model. This function has three steps:

1. Render the mesh from multiple views.
2. Extract features from the rendered images.
   - Use a diffusion model to add textures to the rendered images.
   - Use a DINO model to extract features from the rendered images.
   - Combine the features from the diffusion model and the DINO model for each pixel.
   - Map the features back to the vertices of the mesh using a ball query or the nearest neighbor (depending on use_ball_query).
3. Aggregate the features per vertex across the rendered views.
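
The aggregation in the last step can be pictured with a minimal sketch. It assumes a simple mean over the views in which a vertex received a feature; the documentation does not state which aggregation the function actually uses, and the helper name `aggregate_features_across_views` is hypothetical, not part of the library.

```python
from typing import List, Optional, Sequence

Feature = Sequence[float]


def aggregate_features_across_views(
    per_view_features: Sequence[Sequence[Optional[Feature]]],
) -> List[Optional[List[float]]]:
    """Average each vertex's features over the views that observed it.

    per_view_features[v][i] is the feature vector of vertex i in view v,
    or None when vertex i received no feature in that view.
    A vertex that is never observed stays None (a "missing feature").
    ASSUMPTION: a plain mean; the real function may aggregate differently.
    """
    num_vertices = len(per_view_features[0])
    sums: List[Optional[List[float]]] = [None] * num_vertices
    counts = [0] * num_vertices

    for view in per_view_features:
        for i, feat in enumerate(view):
            if feat is None:
                continue  # vertex i has no feature in this view
            if sums[i] is None:
                sums[i] = [0.0] * len(feat)
            for k, value in enumerate(feat):
                sums[i][k] += value
            counts[i] += 1

    # Divide each running sum by the number of views that contributed to it.
    return [
        None if total is None else [x / counts[i] for x in total]
        for i, total in enumerate(sums)
    ]
```

In this sketch, a vertex left as `None` corresponds to what the `verbose` option reports as a missing feature.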

Parameters:

mesh (MeshContainer | pytorch3d.structures.Meshes) – the mesh from which features will be extracted.
device (str | None) – the device to use for computation. Defaults to ‘cuda’ if available, otherwise ‘cpu’.
diffusion_pipeline (Module | None) – the diffusion model used to generate features.
dino_model (Module | None) – the DINO model used to extract features.
mesh_vertices (Tensor | None) – the vertices where features will be extracted. If not provided, the mesh vertices will be used.
prompt (str | None) – the prompt used to generate texture completions for the diffusion pipeline.
prompts_list (List[str] | None) – the list of prompts used to generate texture completions for the diffusion pipeline.
num_views (int) – the number of views to use for feature extraction.
H (int) – the height of the rendered images.
W (int) – the width of the rendered images.
use_latent (bool) – whether to use latent diffusion in the diffusion pipeline (not implemented yet).
use_normal_map (bool) – whether to use normal maps in the diffusion pipeline (not implemented yet).
use_ball_query (bool) – whether to use ball queries to map features back to the vertices of the mesh.
ball_query_radius_factor (float) – the factor that sets the radius of the ball query. The radius is computed as ball_query_radius_factor * maximal_distance_between_vertices.
num_images_per_prompt (int) – the number of images to generate per prompt.
return_image (bool) – whether to return the generated images.
verbose (bool) – whether to print verbose output such as the number of missing features and the runtime.
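
As an illustration of how `use_ball_query` and `ball_query_radius_factor` interact, the following is a minimal pure-Python sketch, not the library's implementation. It assumes that pixel features are attached to 3D points and that each vertex collects the points within the radius (ball query) or the single closest point (nearest neighbor). The helper names `max_pairwise_distance`, `ball_query`, and `nearest_neighbor` are hypothetical.

```python
import math
from typing import List, Sequence

Point = Sequence[float]


def max_pairwise_distance(vertices: Sequence[Point]) -> float:
    """Largest Euclidean distance between any two vertices (brute force, O(n^2))."""
    return max(
        (math.dist(a, b) for i, a in enumerate(vertices) for b in vertices[i + 1:]),
        default=0.0,
    )


def ball_query(vertex: Point, points: Sequence[Point], radius: float) -> List[int]:
    """Indices of all points that lie within `radius` of `vertex`."""
    return [j for j, p in enumerate(points) if math.dist(vertex, p) <= radius]


def nearest_neighbor(vertex: Point, points: Sequence[Point]) -> int:
    """Index of the single point closest to `vertex`."""
    return min(range(len(points)), key=lambda j: math.dist(vertex, points[j]))


if __name__ == "__main__":
    vertices = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 2.0, 0.0)]
    # ASSUMPTION: 3D points carrying per-pixel features from one rendered view.
    points = [(0.01, 0.0, 0.0), (0.5, 0.0, 0.0), (1.0, 0.02, 0.0)]

    ball_query_radius_factor = 0.01  # the documented default
    radius = ball_query_radius_factor * max_pairwise_distance(vertices)

    for v in vertices:
        print(v, ball_query(v, points, radius), nearest_neighbor(v, points))
```

With the default factor of 0.01 the ball is small relative to the mesh, so a vertex can end up with no points inside it, whereas the nearest-neighbor variant always returns exactly one point. This may be one source of the missing features reported when `verbose` is enabled.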