
This is an old revision of the document!


Fork Synthetic Data Generator

The Manual section explains how to start working with this module. The Code Structure section describes the general structure of the module; detailed documentation of individual functions can be found in the corresponding files. Finally, the WIP (Work-In-Progress) section lists the issues currently being worked on.

Manual

  1. Select dataset to work on:
    1. Click the “Generate dataset” icon to create a new synthetic dataset. Specify the required parameters in the dialog shown and click “OK” to proceed.
    2. Alternatively, go to “File → Open Datafile…” and select a dataset you'd like to modify.
  2. Select plot dimensionality:
    1. Use the switch in the top left corner to toggle between the 2-dimensional and the 3-dimensional scatter plot.
  3. Explore dataset:
    1. To see hidden areas of the dataset, click the “move” icon in the top right corner of the plot and use the 2D slider that appears to move the plot or rotate the cube.
    2. Additionally, you can use the RMB (right mouse button) to either move the plot or rotate the cube.
    3. To view other dimensions, enable the “Inspect Dimensions” icon, then:
    4. In 2D-Mode:
      1. To view different dimensions of the dataset, use the 2D slider in the “Visualize Dimensions” group box in the right part of the screen. Additional options: “Live preview” instantly shows the dimensions you are currently going through; “Rescale” sets all axes of all dimensions to the same range.
      2. Use the mouse wheel to zoom in/out on the dataset.
    5. In 3D-Mode:
      1. To view different dimensions of the dataset, use the 3 sliders in the “Visualize Dimensions” group box in the right part of the screen. The color boxes on the left denote the axes of the cube. “Live preview” instantly shows the dimensions you are going through.
  4. Select datapoints:
    1. In 2D-Mode:
      1. Select the desired selection mode by enabling the corresponding icon.
      2. “Area Selection Tool”: press the LMB (left mouse button) and drag to draw a selection rectangle. All points inside this rectangle are selected.
      3. “Point Selection Tool”: click a datapoint using the LMB.
      4. “Class Selection Tool”: click a datapoint of the required class using the LMB.
      5. Clear selection: click somewhere on the plot.
  5. Interact with dataset:
    1. In 2D-Mode:
      1. Select the desired interaction mode by enabling the corresponding icon.
      2. “Drag'n'drop Tool”: when this option is selected, use the LMB to drag the selected points.
      3. “Assign Class Tool”: when this option is selected, select the new class using the slider shown. Slide all the way to the right to assign a completely new class. Afterwards, click the “Assign Class” button to perform the action.
      4. “Magnet Tool”: when this option is selected, the selected points are influenced by a magnet placed at a chosen point. Specify the magnet's position with the LMB. Check the “Detract” option to make points move away from the magnet. Check “Reverse force” to increase the magnet's force for farther points and decrease it for closer ones. Use the slider to increase the magnet's force.
      5. “Strokes Tool”: when this option is selected, the selected points are resampled according to the specified pattern. Draw a stroke using the LMB. In the “Draw” tab, you can set parameters such as the stroke's size (pen size), hardness (difference between the stroke's core and border), and intensity (the stroke's transparency). You can also import existing images and use them as strokes: switch to the “Save/Load” tab, click the “Import stroke” button, and select an image. The selected points are resampled as soon as the image is selected. To save the current selection as a stroke, click “Save selection as stroke” and specify the stroke's name in the dialog that appears.
    2. In 3D-Mode:
      1. Enable the “Edit Planes Tool” and use the LMB to embed 2D planes in the 3D cube.
      2. To select a plane, click the plane's name in the list view.
      3. To save the currently drawn plane, use the “Save embedded plane” button; to edit a plane, use the “Edit plane” button; to remove a plane, use the “Delete plane” button.
      4. To adjust the depth of the currently drawn plane, use the mouse wheel.
      5. When “Edit plane” is clicked, a view for drawing strokes opens. The interaction is then similar to the strokes interaction in 2D mode: use the LMB to draw a stroke, then click “Resample” to resample the stroke or “Clear” to remove the current drawing and return to 3D mode.
      6. Additional option for strokes in 3D: scattering, which describes how strongly the points are scattered from the original plane. Currently, constant scattering is supported.
      7. Additional option for strokes in 3D: choose the color of a stroke to define which points are influenced. Since no selection mode is supported in 3D, this is the only way to define which points should be resampled.
  6. Modify dataset's metadata:
    1. Enable the “Dataset Details” icon and modify the names of the dataset's classes (“Classes” tab) and dimensions (“Dimensions” tab) in the tool options.
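
The “Magnet Tool” options described above (“Detract”, “Reverse force”, and the force slider) can be illustrated with a small standalone sketch. The names `Point2D`, `MagnetParams`, and `applyMagnet`, the radius of influence, and the linear falloff of the force are all assumptions made for illustration; they are not taken from the module's source.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Hypothetical types for illustration; not the module's actual classes.
struct Point2D {
    double x;
    double y;
};

struct MagnetParams {
    Point2D center;     // magnet position, set with the LMB
    double force;       // slider value in [0, 1]
    double radius;      // assumed radius of influence (> 0)
    bool detract;       // true: push points away instead of pulling them in
    bool reverseForce;  // true: stronger for distant points, weaker for close ones
};

// Moves each selected point one step towards (or away from) the magnet.
// Assumption: the weight changes linearly with distance inside `radius`.
void applyMagnet(std::vector<Point2D>& selected, const MagnetParams& m) {
    assert(m.radius > 0.0 && m.force >= 0.0 && m.force <= 1.0);
    for (Point2D& p : selected) {
        const double dx = m.center.x - p.x;
        const double dy = m.center.y - p.y;
        const double dist = std::hypot(dx, dy);
        if (dist == 0.0 || dist > m.radius) {
            continue;  // no direction defined, or outside the magnet's reach
        }
        const double t = dist / m.radius;  // 0 at the center, 1 at the rim
        const double weight = m.reverseForce ? t : 1.0 - t;
        const double step = m.force * weight;  // fraction of the offset to move
        const double sign = m.detract ? -1.0 : 1.0;
        p.x += sign * step * dx;
        p.y += sign * step * dy;
    }
}
```

In an interactive setting, a function like this would presumably be called repeatedly while the mouse is held down, so that points drift gradually rather than jumping.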

Code Structure

Layout

QML_For_Module_SyntheticMultidimensionalDataGenerator.qml

The layout (itemforModuleSyntheticMultidimensionalDataGeneratorID) consists of 2 main parts: a menu that provides an overview of the available tools, and a visualization area where a scatter plot of the dataset is displayed.

  1. Menu (recMenu): this part of the layout consists of 3 sub-areas:
    1. recDimensionalitySwitch: contains an element that switches between plot dimensionalities (2D/3D).
    2. recMenuTools: contains a number of icons that either call certain functions (e.g. generate dataset) or set the interaction mode, which determines how the mouse behaves (e.g. choose class).
    3. recMenuToolsOptions: if a tool requires additional parameters (e.g. the magnet's power or the stroke's size), they appear in this area.
  2. Visualization area (recVisualizationArea): a large area in the left and middle part of the screen that displays a scatter plot of the selected dimensions of the generated dataset.

Front-End

IUI/syntheticMultidimensionalDataGenerator.h(.cpp)

Module for synthetic data generation that supports (1) generation/import, (2) visualization, (3) modification, and (4) export of a dataset.

  1. Generation/import: a dataset is generated in generateDataset(), which fills an IMatrix object with uniformly distributed random numbers. When updateModule() is called and an existing dataset was imported, importDataset() is called. It converts the OriginalDataset obtained from the import to the internal representation of the dataset, namely an IMatrix object, and converts the IMetadata to a number of vectors that contain the individual metadata attributes for all dimensions.
  2. Visualization: the dataset is visualized using QPainter combined with QCustomPlot. In the 2-dimensional plot, QCustomPlot is used to visualize and update the x- and y-axes, scales, and grid. The 2D plot consists of 4 layers: Axis-Grid Layer, Dataset Layer, Selection Layer, and Interaction Layer. Each layer is represented by a QPixmap and is updated in its own update[Name]Layer() method when mVisualizationMode is set to the respective value. Updating the layers separately reduces the total plotting and update time. The 3-dimensional plot is implemented purely with QPainter. All models (axes, cubes, planes, datapoints) are 3D objects, which are projected onto the xy-plane and rendered according to their z-value. This plot consists of a single layer so that the rendering order is determined in one place.
  3. Modification of the dataset: dataset modifications happen in response to mouse events: mousePressEvent, mouseMoveEvent, mouseReleaseEvent, wheelEvent. General idea: on mousePressEvent, the required interaction parameters are initialized (e.g. the magnet's center, the starting point of the translation). On mouseMoveEvent, these parameters are updated (e.g. the magnet's center is moved, the selection rectangle is drawn), and so is the dataset itself (e.g. on drag'n'drop, all selected points are moved by the translation vector). On mouseReleaseEvent, the parameters are reset to their defaults and the dataset is updated one final time. The corresponding scatter plot is redrawn 50 times per second using the following strategy: on mousePressEvent, a timer is started that fires 50 times a second. On each timeout(), the module's update() method, inherited from QQuickPaintedItem, is called. On mouseReleaseEvent, the timer is stopped and its connection to update() is removed.
  4. Export: in exportDataset(), an empty IMetadata object is created and filled with the relevant data for all dimensions. This object and the dataset, in the form of an IMatrix, are passed to IConverterSynth::convert(), which converts these two inputs to a suitable dataset representation.
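
The generation step in item 1 can be sketched with the C++ standard library alone. This is only an illustration of filling a matrix with uniformly distributed random numbers: the `Matrix` struct is a hypothetical stand-in for IMatrix, and the function name, its parameters, and the explicit seed are assumptions, not the signature of generateDataset().

```cpp
#include <cassert>
#include <cstddef>
#include <random>
#include <vector>

// Hypothetical stand-in for IMatrix: a dense row-major matrix with one row
// per datapoint and one column per dimension.
struct Matrix {
    std::size_t rows;
    std::size_t cols;
    std::vector<double> values;  // rows * cols entries

    double& at(std::size_t r, std::size_t c) { return values[r * cols + c]; }
    double at(std::size_t r, std::size_t c) const { return values[r * cols + c]; }
};

// Fills a numPoints x numDims matrix with values drawn uniformly from
// [minValue, maxValue). The seed parameter is an assumption; it makes
// repeated runs on the same platform reproducible.
Matrix generateUniformDataset(std::size_t numPoints, std::size_t numDims,
                              double minValue, double maxValue,
                              unsigned int seed) {
    assert(minValue < maxValue);
    std::mt19937 rng(seed);
    std::uniform_real_distribution<double> dist(minValue, maxValue);

    Matrix m{numPoints, numDims, std::vector<double>(numPoints * numDims)};
    for (double& v : m.values) {
        v = dist(rng);
    }
    return m;
}
```

A call such as `generateUniformDataset(1000, 5, 0.0, 1.0, 42u)` would yield 1000 datapoints in 5 dimensions, each coordinate in the unit interval.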

Back-End

IFile/IConverterSynth.h(.cpp)

This class extends the convert() method, which now takes an IMetadata and an IMatrix object as input and saves these objects in the correct representation (the dataset is stored as a .data file, the metadata as a .data.json file).
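
As a rough illustration of the two-file output described above, the following sketch writes a dataset and its metadata to two streams. Everything about the on-disk layout here is an assumption: the actual .data and .data.json formats are not specified in this document, and `Dataset`, `Metadata`, `writeData`, `writeMetadataJson`, and `metadataPathFor` are hypothetical names, not part of IConverterSynth.

```cpp
#include <cassert>
#include <cstddef>
#include <ostream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical stand-ins; the real IMatrix/IMetadata layouts are not shown here.
struct Dataset {
    std::size_t cols;            // number of dimensions
    std::vector<double> values;  // row-major; size is a multiple of cols
};

struct Metadata {
    std::vector<std::string> dimensionNames;  // one name per column
};

// Escapes the two characters that would break a JSON string literal.
// Control characters are not handled in this sketch.
std::string escapeJson(const std::string& s) {
    std::string out;
    for (char c : s) {
        if (c == '"' || c == '\\') out += '\\';
        out += c;
    }
    return out;
}

// Writes one datapoint per line, values separated by single spaces
// (assumed layout).
void writeData(std::ostream& os, const Dataset& d) {
    assert(d.cols > 0 && d.values.size() % d.cols == 0);
    for (std::size_t i = 0; i < d.values.size(); ++i) {
        os << d.values[i] << ((i + 1) % d.cols == 0 ? '\n' : ' ');
    }
}

// Writes the dimension names as a small JSON object (assumed schema).
void writeMetadataJson(std::ostream& os, const Metadata& m) {
    os << "{\"dimensions\": [";
    for (std::size_t i = 0; i < m.dimensionNames.size(); ++i) {
        if (i > 0) os << ", ";
        os << '"' << escapeJson(m.dimensionNames[i]) << '"';
    }
    os << "]}\n";
}

// Derives the metadata path from the dataset path: "foo.data" -> "foo.data.json".
std::string metadataPathFor(const std::string& dataPath) {
    return dataPath + ".json";
}
```

Writing to `std::ostream` rather than to a file path keeps the sketch easy to test; a `std::ofstream` opened on the .data path and on the path returned by `metadataPathFor` could be passed in the same way.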

Icon Credits

  1. change_class_*_icon: Icon made by Freepik from www.flaticon.com
  2. choose_dimension_*_icon: Icon made by Smashicons from www.flaticon.com
  3. dataset_details_*_icon: Icon made by Freepik from www.flaticon.com
  4. drag_n_drop_*_icon: Icon made by Pixel perfect from www.flaticon.com
  5. exploration_tool_*_icon: Icon made by Pixel Buddha from www.flaticon.com
  6. export_dataset_*_icon: Icon made by Smashicons from www.flaticon.com
  7. generate_dataset_icon: Icon made by Smashicons from www.flaticon.com
  8. information_icon: Icon made by Smashicons from www.flaticon.com
  9. magnet_*_icon: Icon made by Smashicons from www.flaticon.com
  10. strokes_*_icon: Icon made by Smashicons from www.flaticon.com
  11. switch_*: Icon made by Smashicons from www.flaticon.com
  12. tool_options_*: Icon made by Gregor Cresnar from www.flaticon.com
  13. tools_icon: Icon made by Smashicons from www.flaticon.com
fork_synthetic_data_generator.1542968150.txt.gz · Last modified: 2018/11/23 11:15 by 141.44.233.148