
poloclub

Professional software vendor delivering innovative solutions on the Softono platform. Specialized in both open-source and proprietary software development.

Total Products
4

Software by poloclub

transformer-explainer
Open Source


# Transformer Explainer: Interactive Learning of Text-Generative Models

Transformer Explainer is an interactive visualization tool designed to help anyone learn how Transformer-based models like GPT work. It runs a live GPT-2 model right in your browser, allowing you to experiment with your own text and observe in real time how internal components and operations of the Transformer work together to predict the next tokens. Try Transformer Explainer at http://poloclub.github.io/transformer-explainer and watch a demo video on YouTube https://youtu.be/TFUc41G2ikY.<br/><br/>

[![MIT license](http://img.shields.io/badge/license-MIT-brightgreen.svg)](http://opensource.org/licenses/MIT)
[![arxiv badge](https://img.shields.io/badge/arXiv-2408.04619-red)](https://arxiv.org/abs/2408.04619)

<a href="https://youtu.be/TFUc41G2ikY" target="_blank"><img width="100%" src='https://github.com/user-attachments/assets/0a4d8888-6555-4df5-bc71-77f1299115c3'></a>

## Live Demo

Try Transformer Explainer: http://poloclub.github.io/transformer-explainer

## Research Paper

[**Transformer Explainer: Learning LLM Transformers with Interactive Visual Explanation and Experimentations**](https://dl.acm.org/doi/pdf/10.1145/3772318.3791725). Aeree Cho, Grace C. Kim, Alexander Karpekov, Seongmin Lee, Alec Helbling, Benjamin Hoover, Zijie J. Wang, Minsuk Kahng, Duen Horng Chau. _Proceedings of the 2026 CHI Conference on Human Factors in Computing Systems._

## How to run locally

#### Prerequisites

- Node.js v20 or higher
- NPM v10 or higher

#### Steps

```bash
git clone https://github.com/poloclub/transformer-explainer.git
cd transformer-explainer
npm install
npm run dev
```

Then, on your web browser, access http://localhost:5173.

## Credits

Transformer Explainer was created by <a href="https://aereeeee.github.io/" target="_blank">Aeree Cho</a>, <a href="https://www.linkedin.com/in/chaeyeonggracekim/" target="_blank">Grace C. Kim</a>, <a href="https://alexkarpekov.com/" target="_blank">Alexander Karpekov</a>, <a href="https://alechelbling.com/" target="_blank">Alec Helbling</a>, <a href="https://zijie.wang/" target="_blank">Jay Wang</a>, <a href="https://seongmin.xyz/" target="_blank">Seongmin Lee</a>, <a href="https://bhoov.com/" target="_blank">Benjamin Hoover</a>, and <a href="https://poloclub.github.io/polochau/" target="_blank">Polo Chau</a> at the Georgia Institute of Technology.

## Citation

```bibTeX
@inproceedings{cho2026transformer,
  title={Transformer Explainer: Learning LLM Transformers with Interactive Visual Explanation and Experimentation},
  author={Cho, Aeree and Kim, Grace C and Karpekov, Alexander and Lee, Seongmin and Helbling, Alec and Hoover, Benjamin and Wang, Zijie J and Kahng, Minsuk and Chau, Duen Horng},
  booktitle={Proceedings of the 2026 CHI Conference on Human Factors in Computing Systems},
  pages={1--21},
  year={2026}
}
```

## License

The software is available under the [MIT License](https://github.com/poloclub/transformer-explainer/blob/main/LICENSE).

## Contact

If you have any questions, feel free to [open an issue](https://github.com/poloclub/transformer-explainer/issues/new/choose) or contact [Aeree Cho](https://aereeeee.github.io/) or any of the contributors listed above.

## More AI explainers to check out

- [**Diffusion Explainer**](https://poloclub.github.io/diffusion-explainer) for learning how Stable Diffusion transforms a text prompt into an image
- [**CNN Explainer**](https://poloclub.github.io/cnn-explainer)
- [**GAN Lab**](https://poloclub.github.io/ganlab) for playing with Generative Adversarial Networks in the browser

Education & Learning ML Frameworks
7.8K Github Stars
cnn-explainer
Open Source


# CNN Explainer

An interactive visualization system designed to help non-experts learn about Convolutional Neural Networks (CNNs)

[![build](https://github.com/poloclub/cnn-explainer/workflows/build/badge.svg)](https://github.com/poloclub/cnn-explainer/actions)
[![arxiv badge](https://img.shields.io/badge/arXiv-2004.15004-red)](http://arxiv.org/abs/2004.15004)
[![DOI:10.1109/TVCG.2020.3030418](https://img.shields.io/badge/DOI-10.1109/TVCG.2020.3030418-blue)](https://doi.org/10.1109/TVCG.2020.3030418)

<a href="https://youtu.be/HnWIHWFbuUQ" target="_blank"><img src="https://i.imgur.com/sCsudVg.png" style="max-width:100%;"></a>

For more information, check out our manuscript:

[**CNN Explainer: Learning Convolutional Neural Networks with Interactive Visualization**](https://arxiv.org/abs/2004.15004). Wang, Zijie J., Robert Turko, Omar Shaikh, Haekyu Park, Nilaksh Das, Fred Hohman, Minsuk Kahng, and Duen Horng Chau. *IEEE Transactions on Visualization and Computer Graphics (TVCG), 2020.*

## Live Demo

For a live demo, visit: http://poloclub.github.io/cnn-explainer/

## Running Locally

Clone or download this repository:

```bash
git clone git@github.com:poloclub/cnn-explainer.git

# use degit if you don't want to download commit histories
degit poloclub/cnn-explainer
```

Install the dependencies:

```bash
npm install
```

Then run CNN Explainer:

```bash
npm run dev
```

Navigate to [localhost:3000](https://localhost:3000). You should see CNN Explainer running in your browser :)

To see how we trained the CNN, visit the directory [`./tiny-vgg/`](tiny-vgg). If you want to use CNN Explainer with your own CNN model or image classes, see [#8](/../../issues/8) and [#14](/../../issues/14).
## Credits

CNN Explainer was created by <a href="https://zijie.wang/">Jay Wang</a>, <a href="https://www.linkedin.com/in/robert-turko/">Robert Turko</a>, <a href="http://oshaikh.com/">Omar Shaikh</a>, <a href="https://haekyu.com/">Haekyu Park</a>, <a href="http://nilakshdas.com/">Nilaksh Das</a>, <a href="https://fredhohman.com/">Fred Hohman</a>, <a href="http://minsuk.com">Minsuk Kahng</a>, and <a href="https://www.cc.gatech.edu/~dchau/">Polo Chau</a>, which was the result of a research collaboration between Georgia Tech and Oregon State.

We thank [Anmol Chhabria](https://www.linkedin.com/in/anmolchhabria), [Kaan Sancak](https://kaansancak.com), [Kantwon Rogers](https://www.kantwon.com), and the [Georgia Tech Visualization Lab](http://vis.gatech.edu) for their support and constructive feedback.

## Citation

```bibTeX
@article{wangCNNExplainerLearning2020,
  title = {{{CNN Explainer}}: {{Learning Convolutional Neural Networks}} with {{Interactive Visualization}}},
  shorttitle = {{{CNN Explainer}}},
  author = {Wang, Zijie J. and Turko, Robert and Shaikh, Omar and Park, Haekyu and Das, Nilaksh and Hohman, Fred and Kahng, Minsuk and Chau, Duen Horng},
  journal={IEEE Transactions on Visualization and Computer Graphics (TVCG)},
  year={2020},
  publisher={IEEE}
}
```

## License

The software is available under the [MIT License](https://github.com/poloclub/cnn-explainer/blob/master/LICENSE).

## Contact

If you have any questions, feel free to [open an issue](https://github.com/poloclub/cnn-explainer/issues/new/choose) or contact [Jay Wang](https://zijie.wang).

ML Frameworks Data Visualisation
9K Github Stars
diffusiondb
Open Source


# DiffusionDB

<a href="https://huggingface.co/datasets/poloclub/diffusiondb"><picture><source media="(prefers-color-scheme: dark)" srcset="https://i.imgur.com/yGxUUlX.png"><img src="favicon.ico" align="right" height="40px"></picture></a>

[![hugging](https://img.shields.io/badge/🤗%20Hugging%20Face-Datasets-yellow)](https://huggingface.co/datasets/poloclub/diffusiondb)
[![license](https://img.shields.io/badge/License-CC0/MIT-blue)](#licensing)
[![arxiv badge](https://img.shields.io/badge/arXiv-2210.14896-red)](https://arxiv.org/abs/2210.14896)
[![datasheet](https://img.shields.io/badge/Data%20Sheet-Available-success)](https://poloclub.github.io/diffusiondb/datasheet.html)

<!-- [![DOI:10.1145/3491101.3519653](https://img.shields.io/badge/DOI-10.1145/3491101.3519653-blue)](https://doi.org/10.1145/3491101.3519653) -->

<img width="100%" src="https://user-images.githubusercontent.com/15007159/201762588-f24db2b8-dbb2-4a94-947b-7de393fc3d33.gif">

DiffusionDB is the first large-scale text-to-image prompt dataset. It contains **14 million images** generated by Stable Diffusion using prompts and hyperparameters specified by real users. The unprecedented scale and diversity of this human-actuated dataset provide exciting research opportunities in understanding the interplay between prompts and generative models, detecting deepfakes, and designing human-AI interaction tools to help users more easily use these models.

## Get Started

DiffusionDB is available at [🤗 Hugging Face Datasets](https://huggingface.co/datasets/poloclub/diffusiondb).

## Two Subsets

DiffusionDB provides two subsets (DiffusionDB 2M and DiffusionDB Large) to support different needs.

|Subset|Num of Images|Num of Unique Prompts|Size|Image Directory|Metadata Table|
|:--|--:|--:|--:|--:|--:|
|DiffusionDB 2M|2M|1.5M|1.6TB|`images/`|`metadata.parquet`|
|DiffusionDB Large|14M|1.8M|6.5TB|`diffusiondb-large-part-1/` `diffusiondb-large-part-2/`|`metadata-large.parquet`|

##### Key Differences

1. The two subsets have a similar number of unique prompts, but DiffusionDB Large has many more images. DiffusionDB Large is a superset of DiffusionDB 2M.
2. Images in DiffusionDB 2M are stored in `png` format; images in DiffusionDB Large use a lossless `webp` format.

## Dataset Structure

We use a modularized file structure to distribute DiffusionDB. The 2 million images in DiffusionDB 2M are split into 2,000 folders, where each folder contains 1,000 images and a JSON file that links these 1,000 images to their prompts and hyperparameters. Similarly, the 14 million images in DiffusionDB Large are split into 14,000 folders.

```bash
# DiffusionDB 2M
./
├── images
│   ├── part-000001
│   │   ├── 3bfcd9cf-26ea-4303-bbe1-b095853f5360.png
│   │   ├── 5f47c66c-51d4-4f2c-a872-a68518f44adb.png
│   │   ├── 66b428b9-55dc-4907-b116-55aaa887de30.png
│   │   ├── [...]
│   │   └── part-000001.json
│   ├── part-000002
│   ├── part-000003
│   ├── [...]
│   └── part-002000
└── metadata.parquet
```

```bash
# DiffusionDB Large
./
├── diffusiondb-large-part-1
│   ├── part-000001
│   │   ├── 0a8dc864-1616-4961-ac18-3fcdf76d3b08.webp
│   │   ├── 0a25cacb-5d91-4f27-b18a-bd423762f811.webp
│   │   ├── 0a52d584-4211-43a0-99ef-f5640ee2fc8c.webp
│   │   ├── [...]
│   │   └── part-000001.json
│   ├── part-000002
│   ├── part-000003
│   ├── [...]
│   └── part-010000
├── diffusiondb-large-part-2
│   ├── part-010001
│   │   ├── 0a68f671-3776-424c-91b6-c09a0dd6fc2d.webp
│   │   ├── 0a0756e9-1249-4fe2-a21a-12c43656c7a3.webp
│   │   ├── 0aa48f3d-f2d9-40a8-a800-c2c651ebba06.webp
│   │   ├── [...]
│   │   └── part-010001.json
│   ├── part-010002
│   ├── part-010003
│   ├── [...]
│   └── part-014000
└── metadata-large.parquet
```

These sub-folders have names `part-0xxxxx`, and each image has a unique name generated by [UUID Version 4](https://en.wikipedia.org/wiki/Universally_unique_identifier). The JSON file in a sub-folder has the same name as the sub-folder. Each image is a `PNG` file (DiffusionDB 2M) or a lossless `WebP` file (DiffusionDB Large). The JSON file contains key-value pairs mapping image filenames to their prompts and hyperparameters. For example, below is the image of `f3501e05-aef7-4225-a9e9-f516527408ac.png` and its key-value pair in `part-000001.json`.

<img width="300" src="https://i.imgur.com/gqWcRs2.png">

```json
{
  "f3501e05-aef7-4225-a9e9-f516527408ac.png": {
    "p": "geodesic landscape, john chamberlain, christopher balaskas, tadao ando, 4 k, ",
    "se": 38753269,
    "c": 12.0,
    "st": 50,
    "sa": "k_lms"
  }
}
```

The data fields are:

- key: Unique image name
- `p`: Prompt
- `se`: Random seed
- `c`: CFG Scale (guidance scale)
- `st`: Steps
- `sa`: Sampler

## Dataset Metadata

To help you easily access prompts and other attributes of images without downloading all the Zip files, we include two metadata tables `metadata.parquet` and `metadata-large.parquet` for DiffusionDB 2M and DiffusionDB Large, respectively.

The shape of `metadata.parquet` is (2000000, 13) and the shape of `metadata-large.parquet` is (14000000, 13). The two tables share the same schema, and each row represents an image. We store these tables in the Parquet format because Parquet is column-based: you can efficiently query individual columns (e.g., prompts) without reading the entire table.

Below are three random rows from `metadata.parquet`.

| image_name | prompt | part_id | seed | step | cfg | sampler | width | height | user_name | timestamp | image_nsfw | prompt_nsfw |
|:--|:--|--:|--:|--:|--:|--:|--:|--:|:--|:--|--:|--:|
| 0c46f719-1679-4c64-9ba9-f181e0eae811.png | a small liquid sculpture, corvette, viscous, reflective, digital art | 1050 | 2026845913 | 50 | 7 | 8 | 512 | 512 | c2f288a2ba9df65c38386ffaaf7749106fed29311835b63d578405db9dbcafdb | 2022-08-11 09:05:00+00:00 | 0.0845108 | 0.00383462 |
| a00bdeaa-14eb-4f6c-a303-97732177eae9.png | human sculpture of lanky tall alien on a romantic date at italian restaurant with smiling woman, nice restaurant, photography, bokeh | 905 | 1183522603 | 50 | 10 | 8 | 512 | 768 | df778e253e6d32168eb22279a9776b3cde107cc82da05517dd6d114724918651 | 2022-08-19 17:55:00+00:00 | 0.692934 | 0.109437 |
| 6e5024ce-65ed-47f3-b296-edb2813e3c5b.png | portrait of barbaric spanish conquistador, symmetrical, by yoichi hatakenaka, studio ghibli and dan mumford | 286 | 1713292358 | 50 | 7 | 8 | 512 | 640 | 1c2e93cfb1430adbd956be9c690705fe295cbee7d9ac12de1953ce5e76d89906 | 2022-08-12 03:26:00+00:00 | 0.0773138 | 0.0249675 |

### Metadata Schema

`metadata.parquet` and `metadata-large.parquet` share the same schema.

|Column|Type|Description|
|:---|:---|:---|
|`image_name`|`string`|Image UUID filename.|
|`prompt`|`string`|The text prompt used to generate this image.|
|`part_id`|`uint16`|Folder ID of this image.|
|`seed`|`uint32`|Random seed used to generate this image.|
|`step`|`uint16`|Step count (hyperparameter).|
|`cfg`|`float32`|Guidance scale (hyperparameter).|
|`sampler`|`uint8`|Sampler method (hyperparameter). Mapping: `{1: "ddim", 2: "plms", 3: "k_euler", 4: "k_euler_ancestral", 5: "k_heun", 6: "k_dpm_2", 7: "k_dpm_2_ancestral", 8: "k_lms", 9: "others"}`.|
|`width`|`uint16`|Image width.|
|`height`|`uint16`|Image height.|
|`user_name`|`string`|SHA256 hash of the unique Discord ID of the user who generated this image. For example, the hash for `xiaohk#3146` is `e285b7ef63be99e9107cecd79b280bde602f17e0ca8363cb7a0889b67f0b5ed0`. "deleted_account" refers to users who have deleted their accounts. None means the image had been deleted before we scraped it for the second time.|
|`timestamp`|`timestamp`|UTC timestamp when this image was generated. None means the image had been deleted before we scraped it for the second time. Note that the timestamp is not accurate for duplicate images that have the same prompt, hyperparameters, width, and height.|
|`image_nsfw`|`float32`|Likelihood of an image being NSFW. Scores are predicted by [LAION's state-of-the-art NSFW detector](https://github.com/LAION-AI/LAION-SAFETY) (range from 0 to 1). A score of 2.0 means the image has already been flagged as NSFW and blurred by Stable Diffusion.|
|`prompt_nsfw`|`float32`|Likelihood of a prompt being NSFW. Scores are predicted by the library [Detoxify](https://github.com/unitaryai/detoxify). Each score represents the maximum of `toxicity` and `sexual_explicit` (range from 0 to 1).|

> **Warning**
> Although the Stable Diffusion model has an NSFW filter that automatically blurs user-generated NSFW images, this NSFW filter is not perfect, and DiffusionDB still contains some NSFW images. Therefore, we compute and provide the NSFW scores for images and prompts using the state-of-the-art models. The distribution of these scores is shown below. Please decide on an appropriate NSFW score threshold to filter out NSFW images before using DiffusionDB in your projects.

<picture>
<source media="(prefers-color-scheme: dark)" srcset="https://i.imgur.com/KLbJkUr.png">
<img alt="NSFW Score distributions." src="https://i.imgur.com/1RiGAXL.png" width="100%">
</picture>

## Loading DiffusionDB

DiffusionDB is large (1.6TB or 6.5TB)! However, with our modularized file structure, you can easily load the desired number of images and their prompts and hyperparameters. In the [`example-loading.ipynb`](https://github.com/poloclub/diffusiondb/blob/main/notebooks/example-loading.ipynb) notebook, we demonstrate three methods to load a subset of DiffusionDB. Below is a short summary.

### Method 1. Use Hugging Face Datasets Loader

You can use the Hugging Face [`Datasets`](https://huggingface.co/docs/datasets/quickstart) library to easily load prompts and images from DiffusionDB. We pre-defined 16 DiffusionDB subsets (configurations) based on the number of instances. You can see all subsets in the [Dataset Preview](https://huggingface.co/datasets/poloclub/diffusiondb/viewer/all/train).

> **Note**
> To use Datasets Loader, you need to install `Pillow` as well (`pip install Pillow`)

```python
from datasets import load_dataset

# Load the dataset with the `large_random_1k` subset
dataset = load_dataset('poloclub/diffusiondb', 'large_random_1k')
```

### Method 2. Use a downloader script

This repo includes a Python downloader [`download.py`](https://github.com/poloclub/diffusiondb/blob/main/scripts/download.py) that allows you to download and load DiffusionDB. You can use it from your command line. Below is an example of loading a subset of DiffusionDB.

#### Usage/Examples

The script is run using command-line arguments as follows:

- `-i` `--index` - File to download or lower bound of a range of files if `-r` is also set.
- `-r` `--range` - Upper bound of range of files to download if `-i` is set.
- `-o` `--output` - Name of custom output directory. Defaults to the current directory if not set.
- `-z` `--unzip` - Unzip the file/files after downloading
- `-l` `--large` - Download from DiffusionDB Large. Defaults to DiffusionDB 2M.

##### Downloading a single file

The specific file to download is supplied as the number at the end of the file name on Hugging Face. The script will automatically pad the number out and generate the URL.

```bash
python download.py -i 23
```

##### Downloading a range of files

The lower and upper bounds of the set of files to download are set by the `-i` and `-r` flags respectively.

```bash
python download.py -i 1 -r 2000
```

Note that this range will download the entire dataset. The script will ask you to confirm that you have 1.7TB free at the download destination.

##### Downloading to a specific directory

The script will default to the location of the dataset's `part` .zip files at `images/`. If you wish to move the download location, you should move these files as well or use a symbolic link.

```bash
python download.py -i 1 -r 2000 -o /home/$USER/datahoarding/etc
```

Again, the script will automatically add the `/` between the directory and the file when it downloads.

##### Setting the files to unzip once they've been downloaded

The script is set to unzip the files _after_ all files have downloaded as both can be lengthy processes in certain circumstances.

```bash
python download.py -i 1 -r 2000 -z
```

### Method 3. Use `metadata.parquet` (Text Only)

If your task does not require images, then you can easily access all 2 million prompts and hyperparameters in the `metadata.parquet` table.

```python
from urllib.request import urlretrieve
import pandas as pd

# Download the parquet table
table_url = 'https://huggingface.co/datasets/poloclub/diffusiondb/resolve/main/metadata.parquet'
urlretrieve(table_url, 'metadata.parquet')

# Read the table using Pandas
metadata_df = pd.read_parquet('metadata.parquet')
```

## Dataset Creation

We collected all images from the official Stable Diffusion Discord server. Please read our [research paper](https://arxiv.org/abs/2210.14896) for details. The code is included in [`./scripts/`](./scripts/).

## Data Removal

If you find any harmful images or prompts in DiffusionDB, you can use [this Google Form](https://forms.gle/GbYaSpRNYqxCafMZ9) to report them. Similarly, if you are a creator of an image included in this dataset, you can use the [same form](https://forms.gle/GbYaSpRNYqxCafMZ9) to let us know if you would like to remove your image from DiffusionDB. We will closely monitor this form and update DiffusionDB periodically.

## Credits

DiffusionDB was created by [Jay Wang](https://zijie.wang), [Evan Montoya](https://www.linkedin.com/in/evan-montoya-b252391b4/), [David Munechika](https://www.linkedin.com/in/dmunechika/), [Alex Yang](https://alexanderyang.me), [Ben Hoover](https://www.bhoov.com), and [Polo Chau](https://faculty.cc.gatech.edu/~dchau/).

## Citation

```bibtex
@article{wangDiffusionDBLargescalePrompt2022,
  title = {{{DiffusionDB}}: {{A}} Large-Scale Prompt Gallery Dataset for Text-to-Image Generative Models},
  author = {Wang, Zijie J. and Montoya, Evan and Munechika, David and Yang, Haoyang and Hoover, Benjamin and Chau, Duen Horng},
  year = {2022},
  journal = {arXiv:2210.14896 [cs]},
  url = {https://arxiv.org/abs/2210.14896}
}
```

## Licensing

The DiffusionDB dataset is available under the [CC0 1.0 License](https://creativecommons.org/publicdomain/zero/1.0/). The Python code in this repository is available under the [MIT License](./LICENSE).

## Contact

If you have any questions, feel free to [open an issue](https://github.com/poloclub/diffusiondb/issues/new) or contact [Jay Wang](https://zijie.wang).
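
## Example: Working with Part Folders and Metadata Fields

The folder layout, part JSON keys, sampler mapping, and NSFW scores described in the Dataset Structure and Metadata Schema sections can be tied together in a few lines of standard-library Python. This is a minimal sketch, not code from the DiffusionDB repository: the function names (`part_dir`, `read_part_json`, `is_below_nsfw_threshold`), the readable field names, and the 0.5 threshold are illustrative assumptions. Only the folder names, JSON keys, and sampler IDs come from the README itself.

```python
import json

# Sampler IDs exactly as listed in the Metadata Schema table.
SAMPLERS = {
    1: "ddim", 2: "plms", 3: "k_euler", 4: "k_euler_ancestral", 5: "k_heun",
    6: "k_dpm_2", 7: "k_dpm_2_ancestral", 8: "k_lms", 9: "others",
}

# Short keys used in part-0xxxxx.json; the readable names are hypothetical.
FIELD_NAMES = {"p": "prompt", "se": "seed", "c": "cfg_scale", "st": "steps", "sa": "sampler"}


def part_dir(part_id, large=False):
    """Return the relative folder of a part, following the directory trees."""
    if large:
        if not 1 <= part_id <= 14000:
            raise ValueError("DiffusionDB Large has parts 1 to 14000")
        # Parts 1-10000 live in part-1, parts 10001-14000 in part-2.
        top = "diffusiondb-large-part-1" if part_id <= 10000 else "diffusiondb-large-part-2"
    else:
        if not 1 <= part_id <= 2000:
            raise ValueError("DiffusionDB 2M has parts 1 to 2000")
        top = "images"
    # Folder names are zero-padded to six digits, e.g. part-000023.
    return f"{top}/part-{part_id:06d}"


def read_part_json(text):
    """Flatten the text of one part JSON file into a list of readable records."""
    records = []
    for image_name, fields in json.loads(text).items():
        record = {"image_name": image_name}
        record.update({FIELD_NAMES.get(key, key): value for key, value in fields.items()})
        records.append(record)
    return records


def is_below_nsfw_threshold(image_nsfw, prompt_nsfw, threshold=0.5):
    """Keep a row only if both scores are under the (assumed) threshold.

    An image score of 2.0 (already flagged and blurred) is also rejected,
    because it is above any threshold in the 0 to 1 range.
    """
    return image_nsfw < threshold and prompt_nsfw < threshold


# The record below is the example entry shown in the Dataset Structure section.
sample = json.dumps({
    "f3501e05-aef7-4225-a9e9-f516527408ac.png": {
        "p": "geodesic landscape, john chamberlain, christopher balaskas, tadao ando, 4 k, ",
        "se": 38753269, "c": 12.0, "st": 50, "sa": "k_lms",
    }
})

print(part_dir(23))                    # images/part-000023
print(part_dir(10001, large=True))     # diffusiondb-large-part-2/part-010001
print(read_part_json(sample)[0]["seed"])  # 38753269
print(SAMPLERS[8])                     # k_lms
```

The threshold is deliberately a parameter: the README asks each project to pick its own cutoff from the score distributions, so 0.5 here is only a placeholder.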

Data Labeling
1.4K Github Stars
diffusion-explainer
Open Source


# Diffusion-Explainer

Diffusion Explainer is an interactive visualization tool designed to help anyone learn how Stable Diffusion transforms text prompts into images. It runs in your browser, allowing you to experiment with several preset prompts without any installation, coding skills, or GPUs. Try Diffusion Explainer at https://poloclub.github.io/diffusion-explainer and watch a demo video on YouTube https://youtu.be/Zg4gxdIWDds!

[![MIT license](http://img.shields.io/badge/license-MIT-brightgreen.svg)](http://opensource.org/licenses/MIT)
[![arxiv badge](https://img.shields.io/badge/arXiv-2305.03509-red)](https://arxiv.org/abs/2305.03509)

<!-- ![crown_jewel]() -->

<table>
  <tr>
    <td colspan="4"><video width="100%" src="https://github.com/poloclub/diffusion-explainer/assets/43836461/72974e4c-0a5e-436f-b7a1-89de0500bce1"></video></td>
  </tr>
  <tr>
    <td><a href="http://poloclub.github.io/diffusion-explainer">🚀 Live Demo</a></td>
    <td><a href="https://youtu.be/Zg4gxdIWDds">📺 Demo Video</a></td>
    <td><a href="https://arxiv.org/abs/2305.03509">📜 Research Paper</a></td>
    <td><a href="https://medium.com/@seongminleee/77b53f4f1c4">📄 Blog Post</a></td>
  </tr>
</table>

### Research Paper

[**Diffusion Explainer: Visual Explanation for Text-to-image Stable Diffusion**](https://arxiv.org/abs/2305.03509). Seongmin Lee, Benjamin Hoover, Hendrik Strobelt, Zijie J. Wang, ShengYun Peng, Austin Wright, Kevin Li, Haekyu Park, Haoyang Yang, Duen Horng Chau. Short paper, IEEE VIS 2024.

## How to run locally

```
git clone https://github.com/poloclub/diffusion-explainer.git
cd diffusion-explainer
python -m http.server 8000
```

Then, on your web browser, access http://localhost:8000. You can replace 8000 with other port numbers you want to use.

## Credits

Led by [Seongmin Lee](http://www.seongmin.xyz), Diffusion Explainer is created by Machine Learning and Human-computer Interaction researchers at Georgia Tech and IBM Research. The team includes [Seongmin Lee](http://www.seongmin.xyz), [Benjamin Hoover](https://bhoov.com), [Hendrik Strobelt](http://hendrik.strobelt.com), [Jay Wang](https://zijie.wang), [ShengYun (Anthony) Peng](https://shengyun-peng.github.io), [Austin Wright](https://www.austinpwright.com), [Kevin Li](https://www.linkedin.com/in/kevinyli/), [Haekyu Park](https://haekyu.github.io/), [Alex Yang](https://alexanderyang.me/), and [Polo Chau](http://www.cc.gatech.edu/~dchau/).

## Citation

```bibTeX
@article{lee2024diffusion,
  title = {{D}iffusion {E}xplainer: {V}isual {E}xplanation for {T}ext-to-image {S}table {D}iffusion},
  shorttitle = {Diffusion Explainer},
  author = {Lee, Seongmin and Hoover, Benjamin and Strobelt, Hendrik and Wang, Zijie J and Peng, ShengYun and Wright, Austin and Li, Kevin and Park, Haekyu and Yang, Haoyang and Chau, Duen Horng},
  journal={IEEE VIS},
  year={2024}
}
```

## License

The software is available under the [MIT License](https://github.com/poloclub/diffusion-explainer/blob/main/LICENSE).

## Contact

If you have any questions, feel free to [open an issue](https://github.com/poloclub/diffusion-explainer/issues/new/choose) or contact [Seongmin Lee](http://www.seongmin.xyz/).

Analytics & BI ML Frameworks
475 Github Stars