Update to include S3 support.

Refactor tonnes of code. Double buffering not yet fully tested. PSRAM support doesn't work at all - garbled mess.

Enable in platformIO using:
build_flags =
	-DSPIRAM_FRAMEBUFFER=1
This commit is contained in:
mrfaptastic 2022-09-30 03:17:19 +01:00
parent 86063fe594
commit ebe75dcaba
18 changed files with 1559 additions and 327 deletions

View file

@ -1,190 +1,40 @@
#include <Arduino.h>
#include "ESP32-HUB75-MatrixPanel-I2S-DMA.h"
#if defined(ESP32_SXXX)
#pragma message "Compiling for ESP32-Sx MCUs"
#elif defined(ESP32_CXXX)
#pragma message "Compiling for ESP32-Cx MCUs"
#elif CONFIG_IDF_TARGET_ESP32 || defined(ESP32)
#pragma message "Compiling for original (released 2016) 520kB SRAM ESP32."
#else
#error "Compiling for something unknown!"
#endif
// Credits: Louis Beaudoin <https://github.com/pixelmatix/SmartMatrix/tree/teensylc>
// and Sprite_TM: https://www.esp32.com/viewtopic.php?f=17&t=3188 and https://www.esp32.com/viewtopic.php?f=13&t=3256
/*
This is example code to driver a p3(2121)64*32 -style RGB LED display. These types of displays do not have memory and need to be refreshed
continuously. The display has 2 RGB inputs, 4 inputs to select the active line, a pixel clock input, a latch enable input and an output-enable
input. The display can be seen as 2 64x16 displays consisting of the upper half and the lower half of the display. Each half has a separate
RGB pixel input, the rest of the inputs are shared.
Each display half can only show one line of RGB pixels at a time: to do this, the RGB data for the line is input by setting the RGB input pins
to the desired value for the first pixel, giving the display a clock pulse, setting the RGB input pins to the desired value for the second pixel,
giving a clock pulse, etc. Do this 64 times to clock in an entire row. The pixels will not be displayed yet: until the latch input is made high,
the display will still send out the previously clocked in line. Pulsing the latch input high will replace the displayed data with the data just
clocked in.
The 4 line select inputs select where the currently active line is displayed: when provided with a binary number (0-15), the latched pixel data
will immediately appear on this line. Note: While clocking in data for a line, the *previous* line is still displayed, and these lines should
be set to the value to reflect the position the *previous* line is supposed to be on.
Finally, the screen has an OE input, which is used to disable the LEDs when latching new data and changing the state of the line select inputs:
doing so hides any artefacts that appear at this time. The OE line is also used to dim the display by only turning it on for a limited time every
line.
All in all, an image can be displayed by 'scanning' the display, say, 100 times per second. The slowness of the human eye hides the fact that
only one line is showed at a time, and the display looks like every pixel is driven at the same time.
Now, the RGB inputs for these types of displays are digital, meaning each red, green and blue subpixel can only be on or off. This leads to a
color palette of 8 pixels, not enough to display nice pictures. To get around this, we use binary code modulation.
Binary code modulation is somewhat like PWM, but easier to implement in our case. First, we define the time we would refresh the display without
binary code modulation as the 'frame time'. For, say, a four-bit binary code modulation, the frame time is divided into 15 ticks of equal length.
We also define 4 subframes (0 to 3), defining which LEDs are on and which LEDs are off during that subframe. (Subframes are the same as a
normal frame in non-binary-coded-modulation mode, but are showed faster.) From our (non-monochrome) input image, we take the (8-bit: bit 7
to bit 0) RGB pixel values. If the pixel values have bit 7 set, we turn the corresponding LED on in subframe 3. If they have bit 6 set,
we turn on the corresponding LED in subframe 2, if bit 5 is set subframe 1, if bit 4 is set in subframe 0.
Now, in order to (on average within a frame) turn a LED on for the time specified in the pixel value in the input data, we need to weigh the
subframes. We have 15 pixels: if we show subframe 3 for 8 of them, subframe 2 for 4 of them, subframe 1 for 2 of them and subframe 1 for 1 of
them, this 'automatically' happens. (We also distribute the subframes evenly over the ticks, which reduces flicker.)
In this code, we use the I2S peripheral in parallel mode to achieve this. Essentially, first we allocate memory for all subframes. This memory
contains a sequence of all the signals (2xRGB, line select, latch enable, output enable) that need to be sent to the display for that subframe.
Then we ask the I2S-parallel driver to set up a DMA chain so the subframes are sent out in a sequence that satisfies the requirement that
subframe x has to be sent out for (2^x) ticks. Finally, we fill the subframes with image data.
We use a front buffer/back buffer technique here to make sure the display is refreshed in one go and drawing artefacts do not reach the display.
In practice, for small displays this is not really necessarily.
*/
// macro's to calculate sizes of a single buffer (double buffer takes twice as this)
#define rowBitStructBuffSize sizeof(ESP32_I2S_DMA_STORAGE_TYPE) * (PIXELS_PER_ROW + CLKS_DURING_LATCH) * PIXEL_COLOR_DEPTH_BITS
#define frameStructBuffSize ROWS_PER_FRAME * rowBitStructBuffSize
static const char* TAG = "MatrixPanel";
/* this replicates same function in rowBitStruct, but due to induced inlining it might be MUCH faster when used in tight loops
* while method from struct could be flushed out of instruction cache between loop cycles
* do NOT forget about buff_id param if using this
*
* faptastic note oct22: struct call is not inlined... commenting out this additional compile declaration
*/
#define getRowDataPtr(row, _dpth, buff_id) &(dma_buff.rowBits[row]->data[_dpth * dma_buff.rowBits[row]->width + buff_id*(dma_buff.rowBits[row]->width * dma_buff.rowBits[row]->color_depth)])
//#define getRowDataPtr(row, _dpth, buff_id) &(dma_buff.rowBits[row]->data[_dpth * dma_buff.rowBits[row]->width + buff_id*(dma_buff.rowBits[row]->width * dma_buff.rowBits[row]->color_depth)])
bool MatrixPanel_I2S_DMA::allocateDMAmemory()
{
/***
* Step 1: Look at the overall DMA capable memory for the DMA FRAMEBUFFER data only (not the DMA linked list descriptors yet)
* and do some pre-checks.
*/
int _num_frame_buffers = (m_cfg.double_buff) ? 2:1;
size_t _frame_buffer_memory_required = frameStructBuffSize * _num_frame_buffers;
size_t _dma_linked_list_memory_required = 0;
size_t _total_dma_capable_memory_reserved = 0;
// 1. Calculate the amount of DMA capable memory that's actually available
#if SERIAL_DEBUG
Serial.printf_P(PSTR("Panel Width: %d pixels.\r\n"), PIXELS_PER_ROW);
Serial.printf_P(PSTR("Panel Height: %d pixels.\r\n"), m_cfg.mx_height);
if (m_cfg.double_buff) {
Serial.println(F("DOUBLE FRAME BUFFERS / DOUBLE BUFFERING IS ENABLED. DOUBLE THE RAM REQUIRED!"));
}
Serial.println(F("DMA memory blocks available before any malloc's: "));
heap_caps_print_heap_info(MALLOC_CAP_DMA);
Serial.println(F("******************************************************************"));
Serial.printf_P(PSTR("We're going to need %d bytes of SRAM just for the frame buffer(s).\r\n"), _frame_buffer_memory_required);
Serial.printf_P(PSTR("The total amount of DMA capable SRAM memory is %d bytes.\r\n"), heap_caps_get_free_size(MALLOC_CAP_DMA));
Serial.printf_P(PSTR("Largest DMA capable SRAM memory block is %d bytes.\r\n"), heap_caps_get_largest_free_block(MALLOC_CAP_DMA));
Serial.println(F("******************************************************************"));
#endif
// Can we potentially fit the framebuffer into the DMA capable memory that's available?
if ( heap_caps_get_free_size(MALLOC_CAP_DMA) < _frame_buffer_memory_required ) {
#if SERIAL_DEBUG
Serial.printf_P(PSTR("######### Insufficient memory for requested resolution. Reduce MATRIX_COLOR_DEPTH and try again.\r\n\tAdditional %d bytes of memory required.\r\n\r\n"), (_frame_buffer_memory_required-heap_caps_get_free_size(MALLOC_CAP_DMA)) );
#endif
return false;
}
// Alright, theoretically we should be OK, so let us do this, so
// lets allocate a chunk of memory for each row (a row could span multiple panels if chaining is in place)
// Alright, theoretically we should be OK, so let us do this, so
// lets allocate a chunk of memory for each row (a row could span multiple panels if chaining is in place)
dma_buff.rowBits.reserve(ROWS_PER_FRAME);
// iterate through number of rows
for (int malloc_num =0; malloc_num < ROWS_PER_FRAME; ++malloc_num)
{
auto ptr = std::make_shared<rowBitStruct>(PIXELS_PER_ROW, PIXEL_COLOR_DEPTH_BITS, m_cfg.double_buff);
auto ptr = std::make_shared<rowBitStruct>(PIXELS_PER_ROW, PIXEL_COLOR_DEPTH_BITS, m_cfg.double_buff);
if (ptr->data == nullptr){
#if SERIAL_DEBUG
Serial.printf_P(PSTR("ERROR: Couldn't malloc rowBitStruct %d! Critical fail.\r\n"), malloc_num);
#endif
return false;
// TODO: should we release all previous rowBitStructs here???
}
dma_buff.rowBits.emplace_back(ptr); // save new rowBitStruct into rows vector
++dma_buff.rows;
#if SERIAL_DEBUG
Serial.printf_P(PSTR("Malloc'ing %d bytes of memory @ address %ud for frame row %d.\r\n"), ptr->size()*_num_frame_buffers, (unsigned int)ptr->getDataPtr(), malloc_num);
#endif
}
_total_dma_capable_memory_reserved += _frame_buffer_memory_required;
/***
* Step 2: Calculate the amount of memory required for the DMA engine's linked list descriptors.
* Credit to SmartMatrix for this stuff.
*/
// Calculate what colour depth is actually possible based on memory available vs. required DMA linked-list descriptors.
// aka. Calculate the lowest LSBMSB_TRANSITION_BIT value that will fit in memory
int numDMAdescriptorsPerRow = 0;
lsbMsbTransitionBit = 0;
while(1) {
numDMAdescriptorsPerRow = 1;
for(int i=lsbMsbTransitionBit + 1; i<PIXEL_COLOR_DEPTH_BITS; i++) {
numDMAdescriptorsPerRow += (1<<(i - lsbMsbTransitionBit - 1));
if (ptr->data == nullptr)
{
#if SERIAL_DEBUG
Serial.printf_P(PSTR("ERROR: Couldn't malloc rowBitStruct %d! Critical fail.\r\n"), malloc_num);
#endif
return false;
// TODO: should we release all previous rowBitStructs here???
}
size_t ramrequired = numDMAdescriptorsPerRow * ROWS_PER_FRAME * _num_frame_buffers * sizeof(lldesc_t);
size_t largestblockfree = heap_caps_get_largest_free_block(MALLOC_CAP_DMA);
#if SERIAL_DEBUG
Serial.printf_P(PSTR("lsbMsbTransitionBit of %d with %d DMA descriptors per frame row, requires %d bytes RAM, %d available, leaving %d free: \r\n"), lsbMsbTransitionBit, numDMAdescriptorsPerRow, ramrequired, largestblockfree, largestblockfree - ramrequired);
#endif
if(ramrequired < largestblockfree)
break;
if(lsbMsbTransitionBit < PIXEL_COLOR_DEPTH_BITS - 1)
lsbMsbTransitionBit++;
else
break;
dma_buff.rowBits.emplace_back(ptr); // save new rowBitStruct into rows vector
++dma_buff.rows;
}
#if SERIAL_DEBUG
Serial.printf_P(PSTR("Raised lsbMsbTransitionBit to %d/%d to fit in remaining RAM\r\n"), lsbMsbTransitionBit, PIXEL_COLOR_DEPTH_BITS - 1);
#endif
#ifndef IGNORE_REFRESH_RATE
// calculate the lowest LSBMSB_TRANSITION_BIT value that will fit in memory that will meet or exceed the configured refresh rate
while(1) {
int psPerClock = 1000000000000UL/m_cfg.i2sspeed;
@ -214,17 +64,13 @@ bool MatrixPanel_I2S_DMA::allocateDMAmemory()
break;
}
#if SERIAL_DEBUG
Serial.printf_P(PSTR("Raised lsbMsbTransitionBit to %d/%d to meet minimum refresh rate\r\n"), lsbMsbTransitionBit, PIXEL_COLOR_DEPTH_BITS - 1);
#endif
#endif
/***
* Step 2a: lsbMsbTransition bit is now finalised - recalculate the DMA descriptor count required, which is used for
* memory allocation of the DMA linked list memory structure.
*/
numDMAdescriptorsPerRow = 1;
int numDMAdescriptorsPerRow = 1;
for(int i=lsbMsbTransitionBit + 1; i<PIXEL_COLOR_DEPTH_BITS; i++) {
numDMAdescriptorsPerRow += (1<<(i - lsbMsbTransitionBit - 1));
}
@ -234,7 +80,7 @@ bool MatrixPanel_I2S_DMA::allocateDMAmemory()
// Refer to 'DMA_LL_PAYLOAD_SPLIT' code in configureDMA() below to understand why this exists.
// numDMAdescriptorsPerRow is also used to calculate descount which is super important in i2s_parallel_config_t SoC DMA setup.
if ( rowBitStructBuffSize > DMA_MAX ) {
if ( dma_buff.rowBits[0]->size() > DMA_MAX ) {
#if SERIAL_DEBUG
Serial.printf_P(PSTR("rowColorDepthStruct struct is too large, split DMA payload required. Adding %d DMA descriptors\n"), PIXEL_COLOR_DEPTH_BITS-1);
@ -249,24 +95,12 @@ bool MatrixPanel_I2S_DMA::allocateDMAmemory()
* Step 3: Allocate memory for DMA linked list, linking up each framebuffer row in sequence for GPIO output.
*/
_dma_linked_list_memory_required = numDMAdescriptorsPerRow * ROWS_PER_FRAME * _num_frame_buffers * sizeof(lldesc_t);
#if SERIAL_DEBUG
Serial.printf_P(PSTR("Descriptors for lsbMsbTransitionBit of %d/%d with %d frame rows require %d bytes of DMA RAM with %d numDMAdescriptorsPerRow.\r\n"), lsbMsbTransitionBit, PIXEL_COLOR_DEPTH_BITS - 1, ROWS_PER_FRAME, _dma_linked_list_memory_required, numDMAdescriptorsPerRow);
#endif
_total_dma_capable_memory_reserved += _dma_linked_list_memory_required;
// Do a final check to see if we have enough space for the additional DMA linked list descriptors that will be required to link it all up!
if(_dma_linked_list_memory_required > heap_caps_get_largest_free_block(MALLOC_CAP_DMA)) {
#if SERIAL_DEBUG
Serial.println(F("ERROR: Not enough SRAM left over for DMA linked-list descriptor memory reservation! Oh so close!\r\n"));
#endif
return false;
} // linked list descriptors memory check
// malloc the DMA linked list descriptors that i2s_parallel will need
desccount = numDMAdescriptorsPerRow * ROWS_PER_FRAME;
dma_bus.allocate_dma_desc_memory(desccount);
/*
//lldesc_t * dmadesc_a = (lldesc_t *)heap_caps_malloc(desccount * sizeof(lldesc_t), MALLOC_CAP_DMA);
dmadesc_a = (lldesc_t *)heap_caps_malloc(desccount * sizeof(lldesc_t), MALLOC_CAP_DMA);
assert("Can't allocate descriptor framebuffer a");
@ -289,15 +123,7 @@ bool MatrixPanel_I2S_DMA::allocateDMAmemory()
return false;
}
}
#if SERIAL_DEBUG
Serial.println(F("*** ESP32-HUB75-MatrixPanel-I2S-DMA: Memory Allocations Complete ***"));
Serial.printf_P(PSTR("Total memory that was reserved: %d kB.\r\n"), _total_dma_capable_memory_reserved/1024);
Serial.printf_P(PSTR("... of which was used for the DMA Linked List(s): %d kB.\r\n"), _dma_linked_list_memory_required/1024);
Serial.printf_P(PSTR("Heap Memory Available: %d bytes total. Largest free block: %d bytes.\r\n"), heap_caps_get_free_size(0), heap_caps_get_largest_free_block(0));
Serial.printf_P(PSTR("General RAM Available: %d bytes total. Largest free block: %d bytes.\r\n"), heap_caps_get_free_size(MALLOC_CAP_DEFAULT), heap_caps_get_largest_free_block(MALLOC_CAP_DEFAULT));
#endif
*/
// Just os we know
initialized = true;
@ -310,57 +136,48 @@ bool MatrixPanel_I2S_DMA::allocateDMAmemory()
void MatrixPanel_I2S_DMA::configureDMA(const HUB75_I2S_CFG& _cfg)
{
#if SERIAL_DEBUG
Serial.println(F("configureDMA(): Starting configuration of DMA engine.\r\n"));
#endif
lldesc_t *previous_dmadesc_a = 0;
lldesc_t *previous_dmadesc_b = 0;
// lldesc_t *previous_dmadesc_a = 0;
// lldesc_t *previous_dmadesc_b = 0;
int current_dmadescriptor_offset = 0;
// HACK: If we need to split the payload in 1/2 so that it doesn't breach DMA_MAX, lets do it by the color_depth.
int num_dma_payload_color_depths = PIXEL_COLOR_DEPTH_BITS;
if ( rowBitStructBuffSize > DMA_MAX ) {
if ( dma_buff.rowBits[0]->size() > DMA_MAX ) {
num_dma_payload_color_depths = 1;
}
// Fill DMA linked lists for both frames (as in, halves of the HUB75 panel) and if double buffering is enabled, link it up for both buffers.
for(int row = 0; row < ROWS_PER_FRAME; row++) {
#if SERIAL_DEBUG
Serial.printf_P(PSTR( "Row %d DMA payload of %d bytes. DMA_MAX is %d.\n"), row, dma_buff.rowBits[row]->size(), DMA_MAX);
#endif
// first set of data is LSB through MSB, single pass (IF TOTAL SIZE < DMA_MAX) - all color bits are displayed once, which takes care of everything below and including LSBMSB_TRANSITION_BIT
// NOTE: size must be less than DMA_MAX - worst case for library: 16-bpp with 256 pixels per row would exceed this, need to break into two
link_dma_desc(&dmadesc_a[current_dmadescriptor_offset], previous_dmadesc_a, dma_buff.rowBits[row]->getDataPtr(), dma_buff.rowBits[row]->size(num_dma_payload_color_depths));
previous_dmadesc_a = &dmadesc_a[current_dmadescriptor_offset];
//link_dma_desc(&dmadesc_a[current_dmadescriptor_offset], previous_dmadesc_a, dma_buff.rowBits[row]->getDataPtr(), dma_buff.rowBits[row]->size(num_dma_payload_color_depths));
// previous_dmadesc_a = &dmadesc_a[current_dmadescriptor_offset];
dma_bus.create_dma_desc_link(dma_buff.rowBits[row]->getDataPtr(), dma_buff.rowBits[row]->size(num_dma_payload_color_depths));
if (m_cfg.double_buff) {
link_dma_desc(&dmadesc_b[current_dmadescriptor_offset], previous_dmadesc_b, dma_buff.rowBits[row]->getDataPtr(0, 1), dma_buff.rowBits[row]->size(num_dma_payload_color_depths));
previous_dmadesc_b = &dmadesc_b[current_dmadescriptor_offset]; }
dma_bus.create_dma_desc_link(dma_buff.rowBits[row]->getDataPtr(), dma_buff.rowBits[row]->size(num_dma_payload_color_depths), true);
//link_dma_desc(&dmadesc_b[current_dmadescriptor_offset], previous_dmadesc_b, dma_buff.rowBits[row]->getDataPtr(0, 1), dma_buff.rowBits[row]->size(num_dma_payload_color_depths));
//previous_dmadesc_b = &dmadesc_b[current_dmadescriptor_offset];
}
current_dmadescriptor_offset++;
// If the number of pixels per row is too great for the size of a DMA payload, so we need to split what we were going to send above.
if ( rowBitStructBuffSize > DMA_MAX )
if ( dma_buff.rowBits[0]->size() > DMA_MAX )
{
#if SERIAL_DEBUG
Serial.printf_P(PSTR("Splitting DMA payload for %d color depths into %d byte payloads.\r\n"), PIXEL_COLOR_DEPTH_BITS-1, rowBitStructBuffSize/PIXEL_COLOR_DEPTH_BITS );
#endif
for (int cd = 1; cd < PIXEL_COLOR_DEPTH_BITS; cd++)
{
// first set of data is LSB through MSB, single pass - all color bits are displayed once, which takes care of everything below and including LSBMSB_TRANSITION_BIT
// TODO: size must be less than DMA_MAX - worst case for library: 16-bpp with 256 pixels per row would exceed this, need to break into two
link_dma_desc(&dmadesc_a[current_dmadescriptor_offset], previous_dmadesc_a, dma_buff.rowBits[row]->getDataPtr(cd, 0), dma_buff.rowBits[row]->size(num_dma_payload_color_depths) );
previous_dmadesc_a = &dmadesc_a[current_dmadescriptor_offset];
dma_bus.create_dma_desc_link(dma_buff.rowBits[row]->getDataPtr(cd, 0), dma_buff.rowBits[row]->size(num_dma_payload_color_depths));
if (m_cfg.double_buff) {
link_dma_desc(&dmadesc_b[current_dmadescriptor_offset], previous_dmadesc_b, dma_buff.rowBits[row]->getDataPtr(cd, 1), dma_buff.rowBits[row]->size(num_dma_payload_color_depths));
previous_dmadesc_b = &dmadesc_b[current_dmadescriptor_offset]; }
dma_bus.create_dma_desc_link(dma_buff.rowBits[row]->getDataPtr(cd, 0), dma_buff.rowBits[row]->size(num_dma_payload_color_depths),true);
//link_dma_desc(&dmadesc_b[current_dmadescriptor_offset], previous_dmadesc_b, dma_buff.rowBits[row]->getDataPtr(cd, 1), dma_buff.rowBits[row]->size(num_dma_payload_color_depths));
//previous_dmadesc_b = &dmadesc_b[current_dmadescriptor_offset];
}
current_dmadescriptor_offset++;
@ -374,18 +191,18 @@ void MatrixPanel_I2S_DMA::configureDMA(const HUB75_I2S_CFG& _cfg)
// because we sweep through to MSB each time, it divides the number of times we have to sweep in half (saving linked list RAM)
// we need 2^(i - LSBMSB_TRANSITION_BIT - 1) == 1 << (i - LSBMSB_TRANSITION_BIT - 1) passes from i to MSB
#if SERIAL_DEBUG
Serial.printf_P(PSTR("configureDMA(): DMA Loops for PIXEL_COLOR_DEPTH_BITS %d is: %d.\r\n"), i, (1<<(i - lsbMsbTransitionBit - 1)));
#endif
for(int k=0; k < (1<<(i - lsbMsbTransitionBit - 1)); k++)
{
link_dma_desc(&dmadesc_a[current_dmadescriptor_offset], previous_dmadesc_a, dma_buff.rowBits[row]->getDataPtr(i, 0), dma_buff.rowBits[row]->size(PIXEL_COLOR_DEPTH_BITS - i) );
previous_dmadesc_a = &dmadesc_a[current_dmadescriptor_offset];
// link_dma_desc(&dmadesc_a[current_dmadescriptor_offset], previous_dmadesc_a, dma_buff.rowBits[row]->getDataPtr(i, 0), dma_buff.rowBits[row]->size(PIXEL_COLOR_DEPTH_BITS - i) );
// previous_dmadesc_a = &dmadesc_a[current_dmadescriptor_offset];
dma_bus.create_dma_desc_link(dma_buff.rowBits[row]->getDataPtr(i, 0), dma_buff.rowBits[row]->size(PIXEL_COLOR_DEPTH_BITS - i) );
if (m_cfg.double_buff) {
link_dma_desc(&dmadesc_b[current_dmadescriptor_offset], previous_dmadesc_b, dma_buff.rowBits[row]->getDataPtr(i, 1), dma_buff.rowBits[row]->size(PIXEL_COLOR_DEPTH_BITS - i) );
previous_dmadesc_b = &dmadesc_b[current_dmadescriptor_offset];
dma_bus.create_dma_desc_link(dma_buff.rowBits[row]->getDataPtr(i, 0), dma_buff.rowBits[row]->size(PIXEL_COLOR_DEPTH_BITS - i), true );
//link_dma_desc(&dmadesc_b[current_dmadescriptor_offset], previous_dmadesc_b, dma_buff.rowBits[row]->getDataPtr(i, 1), dma_buff.rowBits[row]->size(PIXEL_COLOR_DEPTH_BITS - i) );
//previous_dmadesc_b = &dmadesc_b[current_dmadescriptor_offset];
}
current_dmadescriptor_offset++;
@ -394,7 +211,7 @@ void MatrixPanel_I2S_DMA::configureDMA(const HUB75_I2S_CFG& _cfg)
} // end color depth loop
} // end frame rows
/*
#if SERIAL_DEBUG
Serial.printf_P(PSTR("configureDMA(): Configured LL structure. %d DMA Linked List descriptors populated.\r\n"), current_dmadescriptor_offset);
@ -433,12 +250,45 @@ void MatrixPanel_I2S_DMA::configureDMA(const HUB75_I2S_CFG& _cfg)
};
// Setup I2S
i2s_parallel_driver_install(ESP32_I2S_DEVICE, &dma_cfg);
i2s_parallel_send_dma(ESP32_I2S_DEVICE, &dmadesc_a[0]);
//i2s_parallel_driver_install(ESP32_I2S_DEVICE, &dma_cfg);
#if SERIAL_DEBUG
Serial.println(F("configureDMA(): DMA setup completed on ESP32_I2S_DEVICE."));
#endif
*/
//
// Setup DMA and Output to GPIO
//
auto bus_cfg = dma_bus.config(); // バス設定用の構造体を取得します。
//bus_cfg.i2s_port = I2S_NUM_0; // 使用するI2Sポートを選択 (I2S_NUM_0 or I2S_NUM_1) (ESP32のI2S LCDモードを使用します)
bus_cfg.bus_freq = _cfg.i2sspeed;
bus_cfg.pin_wr = m_cfg.gpio.clk; // WR を接続しているピン番号
bus_cfg.pin_d0 = m_cfg.gpio.r1;
bus_cfg.pin_d1 = m_cfg.gpio.g1;
bus_cfg.pin_d2 = m_cfg.gpio.b1;
bus_cfg.pin_d3 = m_cfg.gpio.r2;
bus_cfg.pin_d4 = m_cfg.gpio.g2;
bus_cfg.pin_d5 = m_cfg.gpio.b2;
bus_cfg.pin_d6 = m_cfg.gpio.lat;
bus_cfg.pin_d7 = m_cfg.gpio.oe;
bus_cfg.pin_d8 = m_cfg.gpio.a;
bus_cfg.pin_d9 = m_cfg.gpio.b;
bus_cfg.pin_d10 = m_cfg.gpio.c;
bus_cfg.pin_d11 = m_cfg.gpio.d;
bus_cfg.pin_d12 = m_cfg.gpio.e;
bus_cfg.pin_d13 = -1;
bus_cfg.pin_d14 = -1;
bus_cfg.pin_d15 = -1;
dma_bus.config(bus_cfg);
dma_bus.init();
dma_bus.dma_transfer_start();
//i2s_parallel_send_dma(ESP32_I2S_DEVICE, &dmadesc_a[0]);
ESP_LOGI(TAG, "DMA setup completed");
} // end initMatrixDMABuff
@ -499,7 +349,7 @@ void IRAM_ATTR MatrixPanel_I2S_DMA::updateMatrixDMABuffer(int16_t x_coord, int16
* data.
*/
#ifndef ESP32_SXXX
#if defined (ESP32_THE_ORIG)
// We need to update the correct uint16_t in the rowBitStruct array, that gets sent out in parallel
// 16 bit parallel mode - Save the calculated value to the bitplane memory in reverse order to account for I2S Tx FIFO mode1 ordering
// Irrelevant for ESP32-S2 the way the FIFO ordering works is different - refer to page 679 of S2 technical reference manual
@ -539,7 +389,8 @@ void IRAM_ATTR MatrixPanel_I2S_DMA::updateMatrixDMABuffer(int16_t x_coord, int16
// Get the contents at this address,
// it would represent a vector pointing to the full row of pixels for the specified color depth bit at Y coordinate
ESP32_I2S_DMA_STORAGE_TYPE *p = getRowDataPtr(y_coord, color_depth_idx, back_buffer_id);
//ESP32_I2S_DMA_STORAGE_TYPE *p = getRowDataPtr(y_coord, color_depth_idx, back_buffer_id);
ESP32_I2S_DMA_STORAGE_TYPE *p = dma_buff.rowBits[y_coord]->getDataPtr(color_depth_idx, back_buffer_id);
// We need to update the correct uint16_t word in the rowBitStruct array pointing to a specific pixel at X - coordinate
@ -592,7 +443,8 @@ void MatrixPanel_I2S_DMA::updateMatrixDMABuffer(uint8_t red, uint8_t green, uint
--matrix_frame_parallel_row;
// The destination for the pixel row bitstream
ESP32_I2S_DMA_STORAGE_TYPE *p = getRowDataPtr(matrix_frame_parallel_row, color_depth_idx, back_buffer_id);
//ESP32_I2S_DMA_STORAGE_TYPE *p = getRowDataPtr(matrix_frame_parallel_row, color_depth_idx, back_buffer_id);
ESP32_I2S_DMA_STORAGE_TYPE *p = dma_buff.rowBits[matrix_frame_parallel_row]->getDataPtr(color_depth_idx, back_buffer_id);
// iterate pixels in a row
int x_coord=dma_buff.rowBits[matrix_frame_parallel_row]->width;
@ -688,14 +540,14 @@ void MatrixPanel_I2S_DMA::clearFrameBuffer(bool _buff_id){
// switch pointer to a row for a specific color index
row = dma_buff.rowBits[row_idx]->getDataPtr(coloridx, _buff_id);
#ifdef ESP32_SXXX
// -1 works better on ESP32-S2 ? Because bytes get sent out in order...
row[dma_buff.rowBits[row_idx]->width - 1] |= BIT_LAT; // -1 pixel to compensate array index starting at 0
#else
#if defined(ESP32_THE_ORIG)
// We need to update the correct uint16_t in the rowBitStruct array, that gets sent out in parallel
// 16 bit parallel mode - Save the calculated value to the bitplane memory in reverse order to account for I2S Tx FIFO mode1 ordering
// Irrelevant for ESP32-S2 the way the FIFO ordering works is different - refer to page 679 of S2 technical reference manual
row[dma_buff.rowBits[row_idx]->width - 2] |= BIT_LAT; // -2 in the DMA array is actually -1 when it's reordered by TX FIFO
#else
// -1 works better on ESP32-S2 ? Because bytes get sent out in order...
row[dma_buff.rowBits[row_idx]->width - 1] |= BIT_LAT; // -1 pixel to compensate array index starting at 0
#endif
// need to disable OE before/after latch to hide row transition
@ -704,11 +556,7 @@ void MatrixPanel_I2S_DMA::clearFrameBuffer(bool _buff_id){
do {
--_blank;
#ifdef ESP32_SXXX
row[0 + _blank] |= BIT_OE;
row[dma_buff.rowBits[row_idx]->width - _blank - 1 ] |= BIT_OE; // (LAT pulse is (width-2) -1 pixel to compensate array index starting at 0
#else
#if defined(ESP32_THE_ORIG)
// Original ESP32 WROOM FIFO Ordering Sucks
uint8_t _blank_row_tx_fifo_tmp = 0 + _blank;
(_blank_row_tx_fifo_tmp & 1U) ? --_blank_row_tx_fifo_tmp : ++_blank_row_tx_fifo_tmp;
@ -717,7 +565,9 @@ void MatrixPanel_I2S_DMA::clearFrameBuffer(bool _buff_id){
_blank_row_tx_fifo_tmp = dma_buff.rowBits[row_idx]->width - _blank - 1; // (LAT pulse is (width-2) -1 pixel to compensate array index starting at 0
(_blank_row_tx_fifo_tmp & 1U) ? --_blank_row_tx_fifo_tmp : ++_blank_row_tx_fifo_tmp;
row[_blank_row_tx_fifo_tmp] |= BIT_OE;
#else
row[0 + _blank] |= BIT_OE;
row[dma_buff.rowBits[row_idx]->width - _blank - 1 ] |= BIT_OE; // (LAT pulse is (width-2) -1 pixel to compensate array index starting at 0
#endif
} while (_blank);
@ -786,13 +636,13 @@ void MatrixPanel_I2S_DMA::brtCtrlOE(int brt, const bool _buff_id){
do {
--_blank;
#ifdef ESP32_SXXX
row[0 + _blank] |= BIT_OE;
#else
#if defined(ESP32_THE_ORIG)
// Original ESP32 WROOM FIFO Ordering Sucks
uint8_t _blank_row_tx_fifo_tmp = 0 + _blank;
(_blank_row_tx_fifo_tmp & 1U) ? --_blank_row_tx_fifo_tmp : ++_blank_row_tx_fifo_tmp;
row[_blank_row_tx_fifo_tmp] |= BIT_OE;
#else
row[0 + _blank] |= BIT_OE;
#endif
//row[0 + _blank] |= BIT_OE;
@ -915,12 +765,12 @@ void MatrixPanel_I2S_DMA::hlineDMA(int16_t x_coord, int16_t y_coord, int16_t l,
do { // iterate pixels in a row
int16_t _x = x_coord + --_l;
#ifdef ESP32_SXXX
// ESP 32 doesn't need byte flipping for TX FIFO.
uint16_t &v = p[_x];
#else
#if defined(ESP32_THE_ORIG)
// Save the calculated value to the bitplane memory in reverse order to account for I2S Tx FIFO mode1 ordering
uint16_t &v = p[_x & 1U ? --_x : ++_x];
#else
// ESP 32 doesn't need byte flipping for TX FIFO.
uint16_t &v = p[_x];
#endif
v &= _colorbitclear; // reset color bits
@ -956,7 +806,7 @@ void MatrixPanel_I2S_DMA::vlineDMA(int16_t x_coord, int16_t y_coord, int16_t l,
blue = lumConvTab[blue];
#endif
#ifndef ESP32_SXXX
#if defined(ESP32_THE_ORIG)
// Save the calculated value to the bitplane memory in reverse order to account for I2S Tx FIFO mode1 ordering
x_coord & 1U ? --x_coord : ++x_coord;
#endif
@ -994,7 +844,8 @@ void MatrixPanel_I2S_DMA::vlineDMA(int16_t x_coord, int16_t y_coord, int16_t l,
// Get the contents at this address,
// it would represent a vector pointing to the full row of pixels for the specified color depth bit at Y coordinate
ESP32_I2S_DMA_STORAGE_TYPE *p = getRowDataPtr(_y, color_depth_idx, back_buffer_id);
//ESP32_I2S_DMA_STORAGE_TYPE *p = getRowDataPtr(_y, color_depth_idx, back_buffer_id);
ESP32_I2S_DMA_STORAGE_TYPE *p = dma_buff.rowBits[_y]->getDataPtr(color_depth_idx, back_buffer_id);
p[x_coord] &= _colorbitclear; // reset RGB bits
p[x_coord] |= RGB_output_bits; // set new RGB bits

View file

@ -4,6 +4,8 @@
/* Core ESP32 hardware / idf includes! */
#include <vector>
#include <memory>
#include <Arduino.h>
#include "freertos/FreeRTOS.h"
#include "freertos/task.h"
@ -11,7 +13,8 @@
#include "freertos/queue.h"
#include "esp_heap_caps.h"
#include "esp32_i2s_parallel_dma.h"
#include "platforms/platform_detect.hpp"
#ifdef USE_GFX_ROOT
#include <FastLED.h>
@ -25,19 +28,11 @@
* Changing the values just here won't work - as defines needs to persist beyond the scope *
* of just this file. *
*******************************************************************************************/
/* Enable serial debugging of the library, to see how memory is allocated etc. */
//#define SERIAL_DEBUG 1
/* Do NOT build additional methods optimized for fast drawing,
* i.e. Adafruits drawFastHLine, drawFastVLine, etc... */
//#define NO_FAST_FUNCTIONS
// #define NO_FAST_FUNCTIONS
/* Use GFX_Root (https://github.com/mrfaptastic/GFX_Root) instead of Adafruit_GFX library.
* > Removes Bus_IO & Wire.h library dependencies.
* > Provides 24bpp (CRGB) colour support for Adafruit_GFX functions like drawCircle etc.
* > Requires FastLED.h
*/
//#define USE_GFX_ROOT 1
// #define NO_CIE1931
/* Physical / Chained HUB75(s) RGB pixel WIDTH and HEIGHT.
*
@ -62,48 +57,6 @@
#define CHAIN_LENGTH 1 // Number of modules chained together, i.e. 4 panels chained result in virtualmatrix 64x4=256 px long
#endif
/* ESP32 Default Pin definition. You can change this, but best if you keep it as is and provide custom pin mappings
* as part of the begin(...) function.
*/
// Default pin mapping for ESP32-S2 and ESP32-S3
#ifdef ESP32_SXXX
#define R1_PIN_DEFAULT 45
#define G1_PIN_DEFAULT 42
#define B1_PIN_DEFAULT 41
#define R2_PIN_DEFAULT 40
#define G2_PIN_DEFAULT 39
#define B2_PIN_DEFAULT 38
#define A_PIN_DEFAULT 37
#define B_PIN_DEFAULT 36
#define C_PIN_DEFAULT 35
#define D_PIN_DEFAULT 34
#define E_PIN_DEFAULT -1 // required for 1/32 scan panels, like 64x64. Any available pin would do, i.e. IO32
#define LAT_PIN_DEFAULT 26
#define OE_PIN_DEFAULT 21
#define CLK_PIN_DEFAULT 33
// Else use default pin mapping for ESP32 Original WROOM module.
#else
#define R1_PIN_DEFAULT 25
#define G1_PIN_DEFAULT 26
#define B1_PIN_DEFAULT 27
#define R2_PIN_DEFAULT 14
#define G2_PIN_DEFAULT 12
#define B2_PIN_DEFAULT 13
#define A_PIN_DEFAULT 23
#define B_PIN_DEFAULT 19
#define C_PIN_DEFAULT 5
#define D_PIN_DEFAULT 17
#define E_PIN_DEFAULT -1 // IMPORTANT: Change to a valid pin if using a 64x64px panel.
#define LAT_PIN_DEFAULT 4
#define OE_PIN_DEFAULT 15
#define CLK_PIN_DEFAULT 16
#endif
// Interesting Fact: We end up using a uint16_t to send data in parallel to the HUB75... but
// given we only map to 14 physical output wires/bits, we waste 2 bits.
@ -124,12 +77,9 @@
#define COLOR_CHANNELS_PER_PIXEL 3
// #define NO_CIE1931
/***************************************************************************************/
/* Definitions below should NOT be ever changed without rewriting library logic */
#define ESP32_I2S_DMA_MODE I2S_PARALLEL_WIDTH_16 // From esp32_i2s_parallel_v2.h = 16 bits in parallel
#define ESP32_I2S_DMA_MODE 16 // From esp32_i2s_parallel_v2.h = 16 bits in parallel
#define ESP32_I2S_DMA_STORAGE_TYPE uint16_t // DMA output of one uint16_t at a time.
#define CLKS_DURING_LATCH 0 // Not (yet) used.
@ -212,11 +162,18 @@ struct rowBitStruct {
* NOTE: this call might be very slow in loops. Due to poor instruction caching in esp32 it might be required a reread from flash
* every loop cycle, better use inlined #define instead in such cases
*/
ESP32_I2S_DMA_STORAGE_TYPE* getDataPtr(const uint8_t _dpth=0, const bool buff_id=0) { return &(data[_dpth*width + buff_id*(width*color_depth)]); };
inline ESP32_I2S_DMA_STORAGE_TYPE* getDataPtr(const uint8_t _dpth=0, const bool buff_id=0) { return &(data[_dpth*width + buff_id*(width*color_depth)]); };
// constructor - allocates DMA-capable memory to hold the struct data
rowBitStruct(const size_t _width, const uint8_t _depth, const bool _dbuff) : width(_width), color_depth(_depth), double_buff(_dbuff) {
#if defined(SPIRAM_FRAMEBUFFER)
#pragma message "Enabling PSRAM / SPIRAM for frame buffer."
data = (ESP32_I2S_DMA_STORAGE_TYPE *)heap_caps_malloc( size()+size()*double_buff, MALLOC_CAP_SPIRAM | MALLOC_CAP_8BIT);
#else
data = (ESP32_I2S_DMA_STORAGE_TYPE *)heap_caps_malloc( size()+size()*double_buff, MALLOC_CAP_DMA);
#endif
}
~rowBitStruct() { delete data;}
};
@ -426,13 +383,9 @@ class MatrixPanel_I2S_DMA {
// Obj destructor
~MatrixPanel_I2S_DMA(){
stopDMAoutput();
delete dmadesc_a;
if (m_cfg.double_buff)
delete dmadesc_b;
dma_bus.release();
}
@ -525,6 +478,9 @@ class MatrixPanel_I2S_DMA {
Serial.printf_P(PSTR("Set back buffer to: %d\n"), back_buffer_id);
#endif
dma_bus.flip_dma_output_buffer();
/*
i2s_parallel_set_previous_buffer_not_free();
// Wait before we allow any writing to the buffer. Stop flicker.
while(i2s_parallel_is_previous_buffer_free() == false) { }
@ -537,7 +493,9 @@ class MatrixPanel_I2S_DMA {
i2s_parallel_set_previous_buffer_not_free();
// Wait before we allow any writing to the buffer. Stop flicker.
while(i2s_parallel_is_previous_buffer_free() == false) { }
*/
back_buffer_id ^= 1;
@ -593,7 +551,8 @@ class MatrixPanel_I2S_DMA {
*/
void stopDMAoutput() {
resetbuffers();
i2s_parallel_stop_dma(ESP32_I2S_DEVICE);
//i2s_parallel_stop_dma(ESP32_I2S_DEVICE);
dma_bus.dma_transfer_stop();
}
@ -602,6 +561,8 @@ class MatrixPanel_I2S_DMA {
// those might be useful for child classes, like VirtualMatrixPanel
protected:
Bus_Parallel16 dma_bus;
/**
* @brief - clears and reinitializes color/control data in DMA buffs
* When allocated, DMA buffs might be dirty, so we need to blank it and initialize ABCDE,LAT,OE control bits.
@ -685,8 +646,8 @@ class MatrixPanel_I2S_DMA {
// ESP 32 DMA Linked List descriptor
int desccount = 0;
lldesc_t * dmadesc_a = {0};
lldesc_t * dmadesc_b = {0};
// lldesc_t * dmadesc_a = {0};
// lldesc_t * dmadesc_b = {0};
/* Pixel data is organized from LSB to MSB sequentially by row, from row 0 to row matrixHeight/matrixRowsInParallel
* (two rows of pixels are refreshed in parallel)
@ -818,3 +779,57 @@ inline void MatrixPanel_I2S_DMA::drawIcon (int *ico, int16_t x, int16_t y, int16
#endif
// Credits: Louis Beaudoin <https://github.com/pixelmatix/SmartMatrix/tree/teensylc>
// and Sprite_TM: https://www.esp32.com/viewtopic.php?f=17&t=3188 and https://www.esp32.com/viewtopic.php?f=13&t=3256
/*
This is example code to driver a p3(2121)64*32 -style RGB LED display. These types of displays do not have memory and need to be refreshed
continuously. The display has 2 RGB inputs, 4 inputs to select the active line, a pixel clock input, a latch enable input and an output-enable
input. The display can be seen as 2 64x16 displays consisting of the upper half and the lower half of the display. Each half has a separate
RGB pixel input, the rest of the inputs are shared.
Each display half can only show one line of RGB pixels at a time: to do this, the RGB data for the line is input by setting the RGB input pins
to the desired value for the first pixel, giving the display a clock pulse, setting the RGB input pins to the desired value for the second pixel,
giving a clock pulse, etc. Do this 64 times to clock in an entire row. The pixels will not be displayed yet: until the latch input is made high,
the display will still send out the previously clocked in line. Pulsing the latch input high will replace the displayed data with the data just
clocked in.
The 4 line select inputs select where the currently active line is displayed: when provided with a binary number (0-15), the latched pixel data
will immediately appear on this line. Note: While clocking in data for a line, the *previous* line is still displayed, and these lines should
be set to the value to reflect the position the *previous* line is supposed to be on.
Finally, the screen has an OE input, which is used to disable the LEDs when latching new data and changing the state of the line select inputs:
doing so hides any artefacts that appear at this time. The OE line is also used to dim the display by only turning it on for a limited time every
line.
All in all, an image can be displayed by 'scanning' the display, say, 100 times per second. The slowness of the human eye hides the fact that
only one line is showed at a time, and the display looks like every pixel is driven at the same time.
Now, the RGB inputs for these types of displays are digital, meaning each red, green and blue subpixel can only be on or off. This leads to a
color palette of 8 pixels, not enough to display nice pictures. To get around this, we use binary code modulation.
Binary code modulation is somewhat like PWM, but easier to implement in our case. First, we define the time we would refresh the display without
binary code modulation as the 'frame time'. For, say, a four-bit binary code modulation, the frame time is divided into 15 ticks of equal length.
We also define 4 subframes (0 to 3), defining which LEDs are on and which LEDs are off during that subframe. (Subframes are the same as a
normal frame in non-binary-coded-modulation mode, but are showed faster.) From our (non-monochrome) input image, we take the (8-bit: bit 7
to bit 0) RGB pixel values. If the pixel values have bit 7 set, we turn the corresponding LED on in subframe 3. If they have bit 6 set,
we turn on the corresponding LED in subframe 2, if bit 5 is set subframe 1, if bit 4 is set in subframe 0.
Now, in order to (on average within a frame) turn a LED on for the time specified in the pixel value in the input data, we need to weigh the
subframes. We have 15 pixels: if we show subframe 3 for 8 of them, subframe 2 for 4 of them, subframe 1 for 2 of them and subframe 1 for 1 of
them, this 'automatically' happens. (We also distribute the subframes evenly over the ticks, which reduces flicker.)
In this code, we use the I2S peripheral in parallel mode to achieve this. Essentially, first we allocate memory for all subframes. This memory
contains a sequence of all the signals (2xRGB, line select, latch enable, output enable) that need to be sent to the display for that subframe.
Then we ask the I2S-parallel driver to set up a DMA chain so the subframes are sent out in a sequence that satisfies the requirement that
subframe x has to be sent out for (2^x) ticks. Finally, we fill the subframes with image data.
We use a front buffer/back buffer technique here to make sure the display is refreshed in one go and drawing artefacts do not reach the display.
In practice, for small displays this is not really necessarily.
*/

View file

@ -0,0 +1,18 @@
#pragma once
#define R1_PIN_DEFAULT 25
#define G1_PIN_DEFAULT 26
#define B1_PIN_DEFAULT 27
#define R2_PIN_DEFAULT 14
#define G2_PIN_DEFAULT 12
#define B2_PIN_DEFAULT 13
#define A_PIN_DEFAULT 23
#define B_PIN_DEFAULT 19
#define C_PIN_DEFAULT 5
#define D_PIN_DEFAULT 17
#define E_PIN_DEFAULT -1 // IMPORTANT: Change to a valid pin if using a 64x64px panel.
#define LAT_PIN_DEFAULT 4
#define OE_PIN_DEFAULT 15
#define CLK_PIN_DEFAULT 16

View file

@ -0,0 +1,571 @@
/*----------------------------------------------------------------------------/
Lovyan GFX - Graphics library for embedded devices.
Original Source:
https://github.com/lovyan03/LovyanGFX/
Licence:
[FreeBSD](https://github.com/lovyan03/LovyanGFX/blob/master/license.txt)
Author:
[lovyan03](https://twitter.com/lovyan03)
Contributors:
[ciniml](https://github.com/ciniml)
[mongonta0716](https://github.com/mongonta0716)
[tobozo](https://github.com/tobozo)
Modified heavily for the ESP32 HUB75 DMA library by:
[mrfaptastic](https://github.com/mrfaptastic)
/----------------------------------------------------------------------------*/
static const char* TAG = "esp32_i2s_parallel_dma";
#include <sdkconfig.h>
#if defined (CONFIG_IDF_TARGET_ESP32)
#include "esp32_i2s_parallel_dma.hpp"
#include <driver/gpio.h>
#include <driver/periph_ctrl.h>
#include <soc/gpio_sig_map.h>
#include <esp_err.h>
#include <esp_log.h>
/*
callback shiftCompleteCallback;
void setShiftCompleteCallback(callback f) {
shiftCompleteCallback = f;
}
volatile int previousBufferOutputLoopCount = 0;
volatile bool previousBufferFree = true;
static void IRAM_ATTR irq_hndlr(void* arg) { // if we use I2S1 (default)
SET_PERI_REG_BITS(I2S_INT_CLR_REG(ESP32_I2S_DEVICE), I2S_OUT_EOF_INT_CLR_V, 1, I2S_OUT_EOF_INT_CLR_S);
previousBufferFree = true;
} // end irq_hndlr
*/
static i2s_dev_t* getDev(int port)
{
#if defined (CONFIG_IDF_TARGET_ESP32S2)
return &I2S0;
#else
return (port == 0) ? &I2S0 : &I2S1;
#endif
}
void Bus_Parallel16::config(const config_t& cfg)
{
ESP_LOGI(TAG, "Performing config for ESP32 or ESP32-S2");
_cfg = cfg;
auto port = cfg.port;
_dev = getDev(port);
}
//#if defined (CONFIG_IDF_TARGET_ESP32S2)
static void _gpio_pin_init(int pin)
{
if (pin >= 0)
{
gpio_pad_select_gpio(pin);
//gpio_hi(pin);
gpio_set_direction((gpio_num_t)pin, GPIO_MODE_OUTPUT);
gpio_set_drive_capability((gpio_num_t)pin, (gpio_drive_cap_t)3); // esp32s3 as well?
}
}
bool Bus_Parallel16::init(void) // The big one that gets everything setup.
{
ESP_LOGI(TAG, "Performing DMA bus init() for ESP32 or ESP32-S2");
if(_cfg.port < I2S_NUM_0 || _cfg.port >= I2S_NUM_MAX) {
//return ESP_ERR_INVALID_ARG;
return false;
}
if(_cfg.parallel_width < 8 || _cfg.parallel_width >= 24) {
return false;
}
//auto freq = (_cfg.freq_write, 50000000u); // ?
auto freq = (_cfg.bus_freq);
uint32_t _clkdiv_write = 0;
size_t _div_num = 10;
// Calculate clock divider for ESP32-S2
#if defined (CONFIG_IDF_TARGET_ESP32S2)
static constexpr uint32_t pll_160M_clock_d2 = 160 * 1000 * 1000 >> 1;
// I2S_CLKM_DIV_NUM 2=40MHz / 3=27MHz / 4=20MHz / 5=16MHz / 8=10MHz / 10=8MHz
_div_num = std::min(255u, 1 + ((pll_160M_clock_d2) / (1 + _cfg.freq_write)));
_clkdiv_write = I2S_CLK_160M_PLL << I2S_CLK_SEL_S
| I2S_CLK_EN
| 1 << I2S_CLKM_DIV_A_S
| 0 << I2S_CLKM_DIV_B_S
| _div_num << I2S_CLKM_DIV_NUM_S
;
#else
// clock = 80MHz(PLL_D2_CLK)
static constexpr uint32_t pll_d2_clock = 80 * 1000 * 1000;
// I2S_CLKM_DIV_NUM 4=20MHz / 5=16MHz / 8=10MHz / 10=8MHz
_div_num = std::min(255u, std::max(3u, 1 + (pll_d2_clock / (1 + freq))));
_clkdiv_write = I2S_CLK_EN
| 1 << I2S_CLKM_DIV_A_S
| 0 << I2S_CLKM_DIV_B_S
| _div_num << I2S_CLKM_DIV_NUM_S
;
#endif
if(_div_num < 2 || _div_num > 16) {
return false;
}
//ESP_LOGI(TAG, "i2s pll clk_div_main is: %d", _div_num);
auto dev = _dev;
volatile int iomux_signal_base;
volatile int iomux_clock;
int irq_source;
// Initialize I2S0 peripheral
if (_cfg.port == 0)
{
periph_module_reset(PERIPH_I2S0_MODULE);
periph_module_enable(PERIPH_I2S0_MODULE);
iomux_clock = I2S0O_WS_OUT_IDX;
irq_source = ETS_I2S0_INTR_SOURCE;
switch(_cfg.parallel_width) {
case 8:
case 16:
iomux_signal_base = I2S0O_DATA_OUT8_IDX;
break;
case 24:
iomux_signal_base = I2S0O_DATA_OUT0_IDX;
break;
default:
return ESP_ERR_INVALID_ARG;
}
}
#if !defined (CONFIG_IDF_TARGET_ESP32S2)
// Can't compile if I2S1 if it doesn't exist with that hardware's IDF....
else {
periph_module_reset(PERIPH_I2S1_MODULE);
periph_module_enable(PERIPH_I2S1_MODULE);
iomux_clock = I2S1O_WS_OUT_IDX;
irq_source = ETS_I2S1_INTR_SOURCE;
switch(_cfg.parallel_width) {
case 16:
iomux_signal_base = I2S1O_DATA_OUT8_IDX;
break;
case 8:
case 24:
iomux_signal_base = I2S1O_DATA_OUT0_IDX;
break;
default:
return ESP_ERR_INVALID_ARG;
}
}
#endif
// Setup GPIOs
int bus_width = _cfg.parallel_width;
// Clock output GPIO setup
_gpio_pin_init(_cfg.pin_rd); // not used
_gpio_pin_init(_cfg.pin_wr); // clock
_gpio_pin_init(_cfg.pin_rs); // not used
// Data output GPIO setup
int8_t* pins = _cfg.pin_data;
for(int i = 0; i < bus_width; i++)
_gpio_pin_init(pins[i]);
// Route clock signal to clock pin
gpio_matrix_out(_cfg.pin_wr, iomux_clock, _cfg.invert_pclk, 0); // inverst clock if required
for (size_t i = 0; i < bus_width; i++) {
if (pins[i] >= 0) {
gpio_matrix_out(pins[i], iomux_signal_base + i, false, false);
}
}
// Setup i2s clock
dev->sample_rate_conf.val = 0;
// Third stage config, width of data to be written to IO (I think this should always be the actual data width?)
dev->sample_rate_conf.rx_bits_mod = bus_width;
dev->sample_rate_conf.tx_bits_mod = bus_width;
dev->sample_rate_conf.rx_bck_div_num = 2;
dev->sample_rate_conf.tx_bck_div_num = 2;
// Clock configuration
// dev->clkm_conf.val=0; // Clear the clkm_conf struct
/*
#if defined (CONFIG_IDF_TARGET_ESP32S2)
dev->clkm_conf.clk_sel = 2; // esp32-s2 only
dev->clkm_conf.clk_en = 1;
#endif
#if !defined (CONFIG_IDF_TARGET_ESP32S2)
dev->clkm_conf.clka_en=0; // Use the 80mhz system clock (PLL_D2_CLK) when '0'
#endif
dev->clkm_conf.clkm_div_b=0; // Clock numerator
dev->clkm_conf.clkm_div_a=1; // Clock denominator
*/
// Note: clkm_div_num must only be set here AFTER clkm_div_b, clkm_div_a, etc. Or weird things happen!
// On original ESP32, max I2S DMA parallel speed is 20Mhz.
//dev->clkm_conf.clkm_div_num = 32;
dev->clkm_conf.val = _clkdiv_write;
// I2S conf2 reg
dev->conf2.val = 0;
dev->conf2.lcd_en = 1;
dev->conf2.lcd_tx_wrx2_en=0;
dev->conf2.lcd_tx_sdx2_en=0;
// I2S conf reg
dev->conf.val = 0;
#if defined (CONFIG_IDF_TARGET_ESP32S2)
dev->conf.tx_dma_equal=1; // esp32-s2 only
dev->conf.pre_req_en=1; // esp32-s2 only - enable I2S to prepare data earlier? wtf?
#endif
// Now start setting up DMA FIFO
dev->fifo_conf.val = 0;
dev->fifo_conf.rx_data_num = 32; // Thresholds.
dev->fifo_conf.tx_data_num = 32;
dev->fifo_conf.dscr_en = 1;
#if !defined (CONFIG_IDF_TARGET_ESP32S2)
// Enable "One datum will be written twice in LCD mode" - for some reason,
// if we don't do this in 8-bit mode, data is updated on half-clocks not clocks
if(_cfg.parallel_width == 8)
dev->conf2.lcd_tx_wrx2_en=1;
// Not really described for non-pcm modes, although datasheet states it should be set correctly even for LCD mode
// First stage config. Configures how data is loaded into fifo
if(_cfg.parallel_width == 24) {
// Mode 0, single 32-bit channel, linear 32 bit load to fifo
dev->fifo_conf.tx_fifo_mod = 3;
} else {
// Mode 1, single 16-bit channel, load 16 bit sample(*) into fifo and pad to 32 bit with zeros
// *Actually a 32 bit read where two samples are read at once. Length of fifo must thus still be word-aligned
dev->fifo_conf.tx_fifo_mod = 1;
}
// Dictated by ESP32 datasheet
dev->fifo_conf.rx_fifo_mod_force_en = 1;
dev->fifo_conf.tx_fifo_mod_force_en = 1;
// Second stage config
dev->conf_chan.val = 0;
// 16-bit single channel data
dev->conf_chan.tx_chan_mod = 1;
dev->conf_chan.rx_chan_mod = 1;
#endif
// Reset FIFO
dev->conf.rx_fifo_reset = 1;
#if defined (CONFIG_IDF_TARGET_ESP32S2)
while(dev->conf.rx_fifo_reset_st); // esp32-s2 only
#endif
dev->conf.rx_fifo_reset = 0;
dev->conf.tx_fifo_reset = 1;
#if defined (CONFIG_IDF_TARGET_ESP32S2)
while(dev->conf.tx_fifo_reset_st); // esp32-s2 only
#endif
dev->conf.tx_fifo_reset = 0;
// Reset DMA
dev->lc_conf.in_rst = 1;
dev->lc_conf.in_rst = 0;
dev->lc_conf.out_rst = 1;
dev->lc_conf.out_rst = 0;
dev->lc_conf.ahbm_rst = 1;
dev->lc_conf.ahbm_rst = 0;
dev->in_link.val = 0;
dev->out_link.val = 0;
// Device reset
dev->conf.rx_reset=1;
dev->conf.tx_reset=1;
dev->conf.rx_reset=0;
dev->conf.tx_reset=0;
dev->conf1.val = 0;
dev->conf1.tx_stop_en = 0;
/*
// Allocate I2S status structure for buffer swapping stuff
i2s_state = (i2s_parallel_state_t*) malloc(sizeof(i2s_parallel_state_t));
assert(i2s_state != NULL);
i2s_parallel_state_t *state = i2s_state;
state->desccount_a = conf->desccount_a;
state->desccount_b = conf->desccount_b;
state->dmadesc_a = conf->lldesc_a;
state->dmadesc_b = conf->lldesc_b;
state->i2s_interrupt_port_arg = port; // need to keep this somewhere in static memory for the ISR
*/
dev->timing.val = 0;
/*
// We using the double buffering switch logic?
if (conf->int_ena_out_eof)
{
// Get ISR setup
esp_err_t err = esp_intr_alloc(irq_source,
(int)(ESP_INTR_FLAG_IRAM | ESP_INTR_FLAG_LEVEL1),
irq_hndlr,
&state->i2s_interrupt_port_arg, NULL);
if(err) {
return err;
}
// Setup interrupt handler which is focussed only on the (page 322 of Tech. Ref. Manual)
// "I2S_OUT_EOF_INT: Triggered when rxlink has finished sending a packet"
// ... whatever the hell that is supposed to mean... One massive linked list? So all pixels in the chain?
dev->int_ena.out_eof = 1;
}
*/
#if defined (CONFIG_IDF_TARGET_ESP32S2)
ESP_LOGD(TAG, "init() GPIO and clock configuration set for ESP32-S2");
#else
ESP_LOGD(TAG, "init() GPIO and clock configuration set for ESP32");
#endif
return true;
}
void Bus_Parallel16::release(void)
{
if (_dmadesc_a)
{
heap_caps_free(_dmadesc_a);
_dmadesc_a = nullptr;
_dmadesc_count = 0;
}
if (_dmadesc_b)
{
heap_caps_free(_dmadesc_b);
_dmadesc_b = nullptr;
_dmadesc_count = 0;
}
}
void Bus_Parallel16::enable_double_dma_desc(void)
{
_double_dma_buffer = true;
}
// Need this to work for double buffers etc.
bool Bus_Parallel16::allocate_dma_desc_memory(size_t len)
{
if (_dmadesc_a) heap_caps_free(_dmadesc_a); // free all dma descrptios previously
_dmadesc_count = len;
ESP_LOGI(TAG, "Allocating memory for %d DMA descriptors.", len);
_dmadesc_a= (HUB75_DMA_DESCRIPTOR_T*)heap_caps_malloc(sizeof(HUB75_DMA_DESCRIPTOR_T) * len, MALLOC_CAP_DMA);
if (_dmadesc_a == nullptr)
{
ESP_LOGE(TAG, "ERROR: Couldn't malloc _dmadesc_a. Not enough memory.");
return false;
}
if (_double_dma_buffer)
{
if (_dmadesc_b) heap_caps_free(_dmadesc_b); // free all dma descrptios previously
ESP_LOGD(TAG, "Allocating the second buffer (double buffer enabled).");
_dmadesc_b= (HUB75_DMA_DESCRIPTOR_T*)heap_caps_malloc(sizeof(HUB75_DMA_DESCRIPTOR_T) * len, MALLOC_CAP_DMA);
if (_dmadesc_b == nullptr)
{
ESP_LOGE(TAG, "ERROR: Couldn't malloc _dmadesc_b. Not enough memory.");
_double_dma_buffer = false;
return false;
}
}
_dmadesc_a_idx = 0;
_dmadesc_b_idx = 0;
ESP_LOGD(TAG, "Allocating %d bytes of memory for DMA descriptors.", sizeof(HUB75_DMA_DESCRIPTOR_T) * len);
return true;
}
void Bus_Parallel16::create_dma_desc_link(void *data, size_t size, bool dmadesc_b)
{
static constexpr size_t MAX_DMA_LEN = (4096-4);
if (dmadesc_b)
ESP_LOGI(TAG, " * Double buffer descriptor.");
if (size > MAX_DMA_LEN)
{
size = MAX_DMA_LEN;
ESP_LOGW(TAG, "Creating DMA descriptor which links to payload with size greater than MAX_DMA_LEN!");
}
if ( (_dmadesc_a_idx+1) > _dmadesc_count) {
ESP_LOGE(TAG, "Attempted to create more DMA descriptors than allocated memory for. Expecting a maximum of %d DMA descriptors", _dmadesc_count);
return;
}
volatile lldesc_t *dmadesc;
volatile lldesc_t *next;
bool eof = false;
/*
dmadesc_a[desccount-1].eof = 1;
dmadesc_a[desccount-1].qe.stqe_next=(lldesc_t*)&dmadesc_a[0];
*/
// ESP_LOGI(TAG, "Creating descriptor %d\n", _dmadesc_a_idx);
if ( (dmadesc_b == true) ) // for primary buffer
{
dmadesc = &_dmadesc_b[_dmadesc_b_idx];
next = (_dmadesc_b_idx < (_dmadesc_count-1) ) ? &_dmadesc_b[_dmadesc_b_idx+1]:_dmadesc_b;
eof = (_dmadesc_b_idx == (_dmadesc_count-1));
}
else
{
dmadesc = &_dmadesc_a[_dmadesc_a_idx];
// https://stackoverflow.com/questions/47170740/c-negative-array-index
next = (_dmadesc_a_idx < (_dmadesc_count-1) ) ? _dmadesc_a + _dmadesc_a_idx+1:_dmadesc_a;
eof = (_dmadesc_a_idx == (_dmadesc_count-1));
}
if ( _dmadesc_a_idx == (_dmadesc_count-1) ) {
ESP_LOGW(TAG, "Creating final DMA descriptor and linking back to 0.");
}
dmadesc->size = size;
dmadesc->length = size;
dmadesc->buf = (uint8_t*) data;
dmadesc->eof = 0;
dmadesc->sosf = 0;
dmadesc->owner = 1;
dmadesc->qe.stqe_next = (lldesc_t*) next;
dmadesc->offset = 0;
if ( (dmadesc_b == true) ) { // for primary buffer
_dmadesc_b_idx++;
} else {
_dmadesc_a_idx++;
}
} // end create_dma_desc_link
void Bus_Parallel16::dma_transfer_start()
{
auto dev = _dev;
// Configure DMA burst mode
dev->lc_conf.val = I2S_OUT_DATA_BURST_EN | I2S_OUTDSCR_BURST_EN;
// Set address of DMA descriptor
dev->out_link.addr = (uint32_t) _dmadesc_a;
// Start DMA operation
dev->out_link.stop = 0;
dev->out_link.start = 1;
dev->conf.tx_start = 1;
} // end
void Bus_Parallel16::dma_transfer_stop()
{
auto dev = _dev;
// Stop all ongoing DMA operations
dev->out_link.stop = 1;
dev->out_link.start = 0;
dev->conf.tx_start = 0;
} // end
void Bus_Parallel16::flip_dma_output_buffer()
{
if ( _double_dma_buffer == false) return;
if ( _dmadesc_a_active == true) // change across to everything 'b''
{
_dmadesc_a[_dmadesc_count-1].qe.stqe_next = &_dmadesc_b[0];
_dmadesc_b[_dmadesc_count-1].qe.stqe_next = &_dmadesc_b[0];
}
else
{
_dmadesc_a[_dmadesc_count-1].qe.stqe_next = &_dmadesc_a[0];
_dmadesc_b[_dmadesc_count-1].qe.stqe_next = &_dmadesc_a[0];
}
} // end flip
#endif

View file

@ -0,0 +1,138 @@
/*----------------------------------------------------------------------------/
Lovyan GFX - Graphics library for embedded devices.
Original Source:
https://github.com/lovyan03/LovyanGFX/
Licence:
[FreeBSD](https://github.com/lovyan03/LovyanGFX/blob/master/license.txt)
Author:
[lovyan03](https://twitter.com/lovyan03)
Contributors:
[ciniml](https://github.com/ciniml)
[mongonta0716](https://github.com/mongonta0716)
[tobozo](https://github.com/tobozo)
Modified heavily for the ESP32 HUB75 DMA library by:
[mrfaptastic](https://github.com/mrfaptastic)
------------------------------------------------------------------------------
Putins Russia and its genocide in Ukraine is a disgrace to humanity.
https://www.reddit.com/r/ukraine/comments/xfuc6v/more_than_460_graves_have_already_been_found_in/
Xi Jinping and his communist Chinas silence on the war in Ukraine says everything about
how China condones such genocide, especially if it's against 'the west' (aka. decency).
Whilst the good people at Espressif probably have nothing to do with this, the unfortunate
reality is libraries like this increase the popularity of Chinese silicon chips, which
indirectly funds (through CCP state taxes) the growth and empowerment of such a despot government.
Global democracy, decency and security is put at risk with every Chinese silicon chip that is bought.
/----------------------------------------------------------------------------*/
#pragma once
#include <string.h> // memcpy
#include <algorithm>
#include <stdbool.h>
#include <sys/types.h>
#include <freertos/FreeRTOS.h>
#include <driver/i2s.h>
#include <rom/lldesc.h>
#include <rom/gpio.h>
#define DMA_MAX (4096-4)
// The type used for this SoC
#define HUB75_DMA_DESCRIPTOR_T lldesc_t
//----------------------------------------------------------------------------
class Bus_Parallel16
{
public:
Bus_Parallel16()
{
}
struct config_t
{
int port = 0;
// max 20MHz (when in 16 bit / 2 byte mode)
uint32_t bus_freq = 10000000;
int8_t pin_wr = -1; //
int8_t pin_rd = -1;
int8_t pin_rs = -1; // D/C
bool invert_pclk = false;
int8_t parallel_width = 16; // do not change
union
{
int8_t pin_data[16];
struct
{
int8_t pin_d0;
int8_t pin_d1;
int8_t pin_d2;
int8_t pin_d3;
int8_t pin_d4;
int8_t pin_d5;
int8_t pin_d6;
int8_t pin_d7;
int8_t pin_d8;
int8_t pin_d9;
int8_t pin_d10;
int8_t pin_d11;
int8_t pin_d12;
int8_t pin_d13;
int8_t pin_d14;
int8_t pin_d15;
};
};
};
const config_t& config(void) const { return _cfg; }
void config(const config_t& config);
bool init(void) ;
void release(void) ;
void enable_double_dma_desc();
bool allocate_dma_desc_memory(size_t len);
void create_dma_desc_link(void *memory, size_t size, bool dmadesc_b = false);
void dma_transfer_start();
void dma_transfer_stop();
void flip_dma_output_buffer();
private:
void _init_pins() { };
config_t _cfg;
bool _double_dma_buffer = false;
bool _dmadesc_a_active = true;
uint32_t _dmadesc_count = 0; // number of dma decriptors
uint32_t _dmadesc_a_idx = 0;
uint32_t _dmadesc_b_idx = 0;
HUB75_DMA_DESCRIPTOR_T* _dmadesc_a = nullptr;
HUB75_DMA_DESCRIPTOR_T* _dmadesc_b = nullptr;
volatile i2s_dev_t* _dev;
};

View file

@ -0,0 +1,16 @@
#pragma once
#define R1_PIN_DEFAULT 45
#define G1_PIN_DEFAULT 42
#define B1_PIN_DEFAULT 41
#define R2_PIN_DEFAULT 40
#define G2_PIN_DEFAULT 39
#define B2_PIN_DEFAULT 38
#define A_PIN_DEFAULT 37
#define B_PIN_DEFAULT 36
#define C_PIN_DEFAULT 35
#define D_PIN_DEFAULT 34
#define E_PIN_DEFAULT -1 // required for 1/32 scan panels, like 64x64. Any available pin would do, i.e. IO32
#define LAT_PIN_DEFAULT 26
#define OE_PIN_DEFAULT 21
#define CLK_PIN_DEFAULT 33

Binary file not shown.

After

Width:  |  Height:  |  Size: 486 KiB

View file

@ -0,0 +1,16 @@
https://docs.espressif.com/projects/esp-idf/en/latest/esp32s3/api-guides/external-ram.html
Restrictions
External RAM use has the following restrictions:
When flash cache is disabled (for example, if the flash is being written to), the external RAM also becomes inaccessible; any reads from or writes to it will lead to an illegal cache access exception. This is also the reason why ESP-IDF does not by default allocate any task stacks in external RAM (see below).
External RAM cannot be used as a place to store DMA transaction descriptors or as a buffer for a DMA transfer to read from or write into. Therefore when External RAM is enabled, any buffer that will be used in combination with DMA must be allocated using heap_caps_malloc(size, MALLOC_CAP_DMA | MALLOC_CAP_INTERNAL) and can be freed using a standard free() call.
*Note, although ESP32-S3 has hardware support for DMA to/from external RAM, this is not yet supported in ESP-IDF.*
External RAM uses the same cache region as the external flash. This means that frequently accessed variables in external RAM can be read and modified almost as quickly as in internal ram. However, when accessing large chunks of data (>32 KB), the cache can be insufficient, and speeds will fall back to the access speed of the external RAM. Moreover, accessing large chunks of data can “push out” cached flash, possibly making the execution of code slower afterwards.
In general, external RAM will not be used as task stack memory. xTaskCreate() and similar functions will always allocate internal memory for stack and task TCBs.

Binary file not shown.

After

Width:  |  Height:  |  Size: 92 KiB

View file

@ -0,0 +1,18 @@
#pragma once
// Avoid and QSPI pins
#define R1_PIN_DEFAULT 4
#define G1_PIN_DEFAULT 5
#define B1_PIN_DEFAULT 6
#define R2_PIN_DEFAULT 7
#define G2_PIN_DEFAULT 15
#define B2_PIN_DEFAULT 16
#define A_PIN_DEFAULT 18
#define B_PIN_DEFAULT 8
#define C_PIN_DEFAULT 3
#define D_PIN_DEFAULT 42
#define E_PIN_DEFAULT -1 // required for 1/32 scan panels, like 64x64. Any available pin would do, i.e. IO32
#define LAT_PIN_DEFAULT 40
#define OE_PIN_DEFAULT 2
#define CLK_PIN_DEFAULT 41

View file

@ -0,0 +1,356 @@
/*
Simple example of using the ESP32-S3's LCD peripheral for general-purpose
(non-LCD) parallel data output with DMA. Connect 8 LEDs (or logic analyzer),
cycles through a pattern among them at about 1 Hz.
This code is ONLY for the ESP32-S3, NOT the S2, C3 or original ESP32.
None of this is authoritative canon, just a lot of trial error w/datasheet
and register poking. Probably more robust ways of doing this still TBD.
FULL CREDIT goes to AdaFruit
https://blog.adafruit.com/2022/06/21/esp32uesday-more-s3-lcd-peripheral-hacking-with-code/
PLEASE SUPPORT THEM!
*/
#include <Arduino.h>
#include "gdma_lcd_parallel16.hpp"
static const char* TAG = "gdma_lcd_parallel16";
dma_descriptor_t desc; // DMA descriptor for testing
/*
uint8_t data[8][312]; // Transmit buffer (2496 bytes total)
uint16_t* dmabuff2;
*/
// End-of-DMA-transfer callback
static IRAM_ATTR bool dma_callback(gdma_channel_handle_t dma_chan,
gdma_event_data_t *event_data, void *user_data) {
// This DMA callback seems to trigger a moment before the last data has
// issued (buffering between DMA & LCD peripheral?), so pause a moment
// before stopping LCD data out. The ideal delay may depend on the LCD
// clock rate...this one was determined empirically by monitoring on a
// logic analyzer. YMMV.
esp_rom_delay_us(100);
// The LCD peripheral stops transmitting at the end of the DMA xfer, but
// clear the lcd_start flag anyway -- we poll it in loop() to decide when
// the transfer has finished, and the same flag is set later to trigger
// the next transfer.
LCD_CAM.lcd_user.lcd_start = 0;
return true;
}
static lcd_cam_dev_t* getDev(int port)
{
return &LCD_CAM;
}
// ------------------------------------------------------------------------------
void Bus_Parallel16::config(const config_t& cfg)
{
_cfg = cfg;
auto port = cfg.port;
_dev = getDev(port);
}
//https://github.com/adafruit/Adafruit_Protomatter/blob/master/src/arch/esp32-s3.h
bool Bus_Parallel16::init(void)
{
///dmabuff2 = (uint16_t*)heap_caps_malloc(sizeof(uint16_t) * 64*32, MALLOC_CAP_SPIRAM | MALLOC_CAP_8BIT);
// LCD_CAM peripheral isn't enabled by default -- MUST begin with this:
periph_module_enable(PERIPH_LCD_CAM_MODULE);
periph_module_reset(PERIPH_LCD_CAM_MODULE);
// Reset LCD bus
LCD_CAM.lcd_user.lcd_reset = 1;
esp_rom_delay_us(100);
auto lcd_clkm_div_num = 160000000 / _cfg.bus_freq / 2;
ESP_LOGI(TAG, "Clock divider is %d", lcd_clkm_div_num);
// Configure LCD clock. Since this program generates human-perceptible
// output and not data for LED matrices or NeoPixels, use almost the
// slowest LCD clock rate possible. The S3-mini module used on Feather
// ESP32-S3 has a 40 MHz crystal. A 2-stage clock division of 1:16000
// is applied (250*64), yielding 2,500 Hz. Still much too fast for
// human eyes, so later we set up the data to repeat each output byte
// many times over.
LCD_CAM.lcd_clock.clk_en = 1; // Enable peripheral clock
LCD_CAM.lcd_clock.lcd_clk_sel = 2; // 160mhz source
LCD_CAM.lcd_clock.lcd_ck_out_edge = 0; // PCLK low in 1st half cycle
LCD_CAM.lcd_clock.lcd_ck_idle_edge = 0; // PCLK low idle
LCD_CAM.lcd_clock.lcd_clk_equ_sysclk = 0; // PCLK = CLK / (CLKCNT_N+1)
LCD_CAM.lcd_clock.lcd_clkm_div_num = lcd_clkm_div_num; // 1st stage 1:250 divide
LCD_CAM.lcd_clock.lcd_clkm_div_a = 1; // 0/1 fractional divide
LCD_CAM.lcd_clock.lcd_clkm_div_b = 0;
// See section 26.3.3.1 of the ESP32­S3 Technical Reference Manual
// for information on other clock sources and dividers.
// Configure LCD frame format. This is where we fiddle the peripheral
// to provide generic 8-bit output rather than actually driving an LCD.
// There's also a 16-bit mode but that's not shown here.
LCD_CAM.lcd_ctrl.lcd_rgb_mode_en = 0; // i8080 mode (not RGB)
LCD_CAM.lcd_rgb_yuv.lcd_conv_bypass = 0; // Disable RGB/YUV converter
LCD_CAM.lcd_misc.lcd_next_frame_en = 0; // Do NOT auto-frame
LCD_CAM.lcd_data_dout_mode.val = 0; // No data delays
LCD_CAM.lcd_user.lcd_always_out_en = 1; // Enable 'always out' mode
LCD_CAM.lcd_user.lcd_8bits_order = 0; // Do not swap bytes
LCD_CAM.lcd_user.lcd_bit_order = 0; // Do not reverse bit order
LCD_CAM.lcd_user.lcd_2byte_en = 1; // 8-bit data mode
LCD_CAM.lcd_user.lcd_dummy = 0; // Dummy phase(s) @ LCD start
LCD_CAM.lcd_user.lcd_dummy_cyclelen = 0; // 1 dummy phase
LCD_CAM.lcd_user.lcd_cmd = 0; // No command at LCD start
// "Dummy phases" are initial LCD peripheral clock cycles before data
// begins transmitting when requested. After much testing, determined
// that at least one dummy phase MUST be enabled for DMA to trigger
// reliably. A problem with dummy phase(s) is if we're also using the
// LCD_PCLK_IDX signal (not used in this code, but Adafruit_Protomatter
// does)...the clock signal will start a couple of pulses before data,
// which may or may not be problematic in some situations. You can
// disable the dummy phase but need to keep the LCD TX FIFO primed
// in that case, which gets complex.
// always_out_en is set above to allow aribtrary-length transfers,
// else lcd_dout_cyclelen is used...but is limited to 8K. Long (>4K)
// transfers need DMA linked lists, not used here but mentioned later.
// Route 8 LCD data signals to GPIO pins
int8_t* pins = _cfg.pin_data;
for(int i = 0; i < 16; i++)
{
if (pins[i] >= 0) { // -1 value will CRASH the ESP32!
esp_rom_gpio_connect_out_signal(pins[i], LCD_DATA_OUT0_IDX + i, false, false);
gpio_hal_iomux_func_sel(GPIO_PIN_MUX_REG[pins[i]], PIN_FUNC_GPIO);
gpio_set_drive_capability((gpio_num_t)pins[i], (gpio_drive_cap_t)3);
}
}
/*
const struct {
int8_t pin;
uint8_t signal;
} mux[] = {
{ 43, LCD_DATA_OUT0_IDX }, // These are 8 consecutive pins down one
{ 42, LCD_DATA_OUT1_IDX }, // side of the ESP32-S3 Feather. The ESP32
{ 2, LCD_DATA_OUT2_IDX }, // has super flexible pin MUX capabilities,
{ 9, LCD_DATA_OUT3_IDX }, // so any signal can go to any pin!
{ 10, LCD_DATA_OUT4_IDX },
{ 11, LCD_DATA_OUT5_IDX },
{ 12, LCD_DATA_OUT6_IDX },
{ 13, LCD_DATA_OUT7_IDX },
};
for (int i = 0; i < 8; i++) {
esp_rom_gpio_connect_out_signal(mux[i].pin, LCD_DATA_OUT0_IDX + i, false, false);
gpio_hal_iomux_func_sel(GPIO_PIN_MUX_REG[mux[i].pin], PIN_FUNC_GPIO);
gpio_set_drive_capability((gpio_num_t)mux[i].pin, (gpio_drive_cap_t)3);
}
*/
// Clock
esp_rom_gpio_connect_out_signal(_cfg.pin_wr, LCD_PCLK_IDX, _cfg.invert_pclk, false);
gpio_hal_iomux_func_sel(GPIO_PIN_MUX_REG[_cfg.pin_wr], PIN_FUNC_GPIO);
gpio_set_drive_capability((gpio_num_t)_cfg.pin_wr, (gpio_drive_cap_t)3);
// This program has a known fixed-size data buffer (2496 bytes) that fits
// in a single DMA descriptor (max 4095 bytes). Large transfers would
// require a linked list of descriptors, but here it's just one...
desc.dw0.owner = DMA_DESCRIPTOR_BUFFER_OWNER_DMA;
desc.dw0.suc_eof = 0; // Last descriptor
desc.next = &desc; // No linked list
// Remaining descriptor elements are initialized before each DMA transfer.
// Allocate DMA channel and connect it to the LCD peripheral
static gdma_channel_alloc_config_t dma_chan_config = {
.sibling_chan = NULL,
.direction = GDMA_CHANNEL_DIRECTION_TX,
.flags = {
.reserve_sibling = 0
}
};
gdma_new_channel(&dma_chan_config, &dma_chan);
gdma_connect(dma_chan, GDMA_MAKE_TRIGGER(GDMA_TRIG_PERIPH_LCD, 0));
static gdma_strategy_config_t strategy_config = {
.owner_check = false,
.auto_update_desc = false
};
gdma_apply_strategy(dma_chan, &strategy_config);
// Enable DMA transfer callback
static gdma_tx_event_callbacks_t tx_cbs = {
.on_trans_eof = dma_callback
};
gdma_register_tx_event_callbacks(dma_chan, &tx_cbs, NULL);
// As mentioned earlier, the slowest clock we can get to the LCD
// peripheral is 40 MHz / 250 / 64 = 2500 Hz. To make an even slower
// bit pattern that's perceptible, we just repeat each value many
// times over. The pattern here just counts through each of 8 bits
// (each LED lights in sequence)...so to get this to repeat at about
// 1 Hz, each LED is lit for 2500/8 or 312 cycles, hence the
// data[8][312] declaration at the start of this code (it's not
// precisely 1 Hz because reality is messy, but sufficient for demo).
// In actual use, say controlling an LED matrix or NeoPixels, such
// shenanigans aren't necessary, as these operate at multiple MHz
// with much smaller clock dividers and can use 1 byte per datum.
/*
for (int i = 0; i < (sizeof(data) / sizeof(data[0])); i++) { // 0 to 7
for (int j = 0; j < sizeof(data[0]); j++) { // 0 to 311
data[i][j] = 1 << i;
}
}
*/
// This uses a busy loop to wait for each DMA transfer to complete...
// but the whole point of DMA is that one's code can do other work in
// the interim. The CPU is totally free while the transfer runs!
while (LCD_CAM.lcd_user.lcd_start); // Wait for DMA completion callback
// After much experimentation, each of these steps is required to get
// a clean start on the next LCD transfer:
gdma_reset(dma_chan); // Reset DMA to known state
LCD_CAM.lcd_user.lcd_dout = 1; // Enable data out
LCD_CAM.lcd_user.lcd_update = 1; // Update registers
LCD_CAM.lcd_misc.lcd_afifo_reset = 1; // Reset LCD TX FIFO
// This program happens to send the same data over and over...but,
// if desired, one could fill the data buffer with a new bit pattern
// here, or point to a completely different buffer each time through.
// With two buffers, one can make best use of time by filling each
// with new data before the busy loop above, alternating between them.
// Reset elements of DMA descriptor. Just one in this code, long
// transfers would loop through a linked list.
/*
desc.dw0.size = desc.dw0.length = sizeof(data);
desc.buffer = dmabuff2; //data;
desc.next = &desc;
*/
/*
//gdma_start(dma_chan, (intptr_t)&desc); // Start DMA w/updated descriptor(s)
gdma_start(dma_chan, (intptr_t)&_dmadesc_a[0]); // Start DMA w/updated descriptor(s)
esp_rom_delay_us(100); // Must 'bake' a moment before...
LCD_CAM.lcd_user.lcd_start = 1; // Trigger LCD DMA transfer
*/
return true; // no return val = illegal instruction
}
void Bus_Parallel16::release(void)
{
if (_i80_bus)
{
esp_lcd_del_i80_bus(_i80_bus);
}
if (_dmadesc_a)
{
heap_caps_free(_dmadesc_a);
_dmadesc_a = nullptr;
_dmadesc_count = 0;
}
}
void Bus_Parallel16::enable_double_dma_desc(void)
{
//_double_dma_buffer = true;
}
// Need this to work for double buffers etc.
bool Bus_Parallel16::allocate_dma_desc_memory(size_t len)
{
if (_dmadesc_a) heap_caps_free(_dmadesc_a); // free all dma descrptios previously
_dmadesc_count = len;
ESP_LOGD(TAG, "Allocating %d bytes memory for DMA descriptors.", sizeof(HUB75_DMA_DESCRIPTOR_T) * len);
_dmadesc_a= (HUB75_DMA_DESCRIPTOR_T*)heap_caps_malloc(sizeof(HUB75_DMA_DESCRIPTOR_T) * len, MALLOC_CAP_DMA);
if (_dmadesc_a == nullptr)
{
ESP_LOGE(TAG, "ERROR: Couldn't malloc _dmadesc_a. Not enough memory.");
return false;
}
_dmadesc_a_idx = 0;
// _dmadesc_b_idx = 0;
return true;
}
void Bus_Parallel16::create_dma_desc_link(void *data, size_t size, bool dmadesc_b)
{
static constexpr size_t MAX_DMA_LEN = (4096-4);
if (size > MAX_DMA_LEN)
{
size = MAX_DMA_LEN;
ESP_LOGW(TAG, "Creating DMA descriptor which links to payload with size greater than MAX_DMA_LEN!");
}
if ( _dmadesc_a_idx >= _dmadesc_count)
{
ESP_LOGE(TAG, "Attempted to create more DMA descriptors than allocated memory for. Expecting a maximum of %d DMA descriptors", _dmadesc_count);
return;
}
_dmadesc_a[_dmadesc_a_idx].dw0.owner = DMA_DESCRIPTOR_BUFFER_OWNER_DMA;
_dmadesc_a[_dmadesc_a_idx].dw0.suc_eof = 0;
_dmadesc_a[_dmadesc_a_idx].dw0.size = _dmadesc_a[_dmadesc_a_idx].dw0.length = size; //sizeof(data);
_dmadesc_a[_dmadesc_a_idx].buffer = data; //data;
if (_dmadesc_a_idx == _dmadesc_count-1) {
_dmadesc_a[_dmadesc_a_idx].next = (dma_descriptor_t *) &_dmadesc_a[0];
}
else {
_dmadesc_a[_dmadesc_a_idx].next = (dma_descriptor_t *) &_dmadesc_a[_dmadesc_a_idx+1];
}
_dmadesc_a_idx++;
} // end create_dma_desc_link
void Bus_Parallel16::dma_transfer_start()
{
gdma_start(dma_chan, (intptr_t)&_dmadesc_a[0]); // Start DMA w/updated descriptor(s)
esp_rom_delay_us(100); // Must 'bake' a moment before...
LCD_CAM.lcd_user.lcd_start = 1; // Trigger LCD DMA transfer
} // end
void Bus_Parallel16::dma_transfer_stop()
{
LCD_CAM.lcd_user.lcd_reset = 1; // Trigger LCD DMA transfer
LCD_CAM.lcd_user.lcd_update = 1; // Trigger LCD DMA transfer
gdma_stop(dma_chan);
} // end
void Bus_Parallel16::flip_dma_output_buffer()
{
} // end flip

View file

@ -0,0 +1,176 @@
/*
Simple example of using the ESP32-S3's LCD peripheral for general-purpose
(non-LCD) parallel data output with DMA. Connect 8 LEDs (or logic analyzer),
cycles through a pattern among them at about 1 Hz.
This code is ONLY for the ESP32-S3, NOT the S2, C3 or original ESP32.
None of this is authoritative canon, just a lot of trial error w/datasheet
and register poking. Probably more robust ways of doing this still TBD.
FULL CREDIT goes to AdaFruit
https://blog.adafruit.com/2022/06/21/esp32uesday-more-s3-lcd-peripheral-hacking-with-code/
PLEASE SUPPORT THEM!
Putins Russia and its genocide in Ukraine is a disgrace to humanity.
https://www.reddit.com/r/ukraine/comments/xfuc6v/more_than_460_graves_have_already_been_found_in/
/----------------------------------------------------------------------------
*/
#pragma once
#if __has_include (<esp_lcd_panel_io.h>)
#include <sdkconfig.h>
#include <esp_lcd_panel_io.h>
//#include <freertos/portmacro.h>
#include <esp_intr_alloc.h>
#include <esp_err.h>
#include <esp_log.h>
#include <driver/gpio.h>
#include <soc/gpio_sig_map.h>
#include <hal/gpio_ll.h>
#include <hal/lcd_hal.h>
#include <string.h>
#include <math.h>
#include <stdint.h>
#if (ESP_IDF_VERSION_MAJOR == 5)
#include <esp_private/periph_ctrl.h>
#else
#include <driver/periph_ctrl.h>
#endif
#include <esp_private/gdma.h>
#include <esp_rom_gpio.h>
#include <hal/dma_types.h>
#include <hal/gpio_hal.h>
#include <hal/lcd_ll.h>
#include <soc/lcd_cam_reg.h>
#include <soc/lcd_cam_struct.h>
#include <esp_heap_caps.h>
#include <esp_heap_caps_init.h>
#if __has_include (<esp_private/periph_ctrl.h>)
#include <esp_private/periph_ctrl.h>
#else
#include <driver/periph_ctrl.h>
#endif
#if __has_include(<esp_arduino_version.h>)
#include <esp_arduino_version.h>
#endif
#define DMA_MAX (4096-4)
// The type used for this SoC
#define HUB75_DMA_DESCRIPTOR_T dma_descriptor_t
//----------------------------------------------------------------------------
class Bus_Parallel16
{
public:
Bus_Parallel16()
{
}
struct config_t
{
// LCD_CAM peripheral number. No need to change (only 0 for ESP32-S3.)
int port = 0;
// max 40MHz (when in 16 bit / 2 byte mode)
uint32_t bus_freq = 20000000;
int8_t pin_wr = -1;
int8_t pin_rd = -1;
int8_t pin_rs = -1; // D/C
bool invert_pclk = false;
union
{
int8_t pin_data[16];
struct
{
int8_t pin_d0;
int8_t pin_d1;
int8_t pin_d2;
int8_t pin_d3;
int8_t pin_d4;
int8_t pin_d5;
int8_t pin_d6;
int8_t pin_d7;
int8_t pin_d8;
int8_t pin_d9;
int8_t pin_d10;
int8_t pin_d11;
int8_t pin_d12;
int8_t pin_d13;
int8_t pin_d14;
int8_t pin_d15;
};
};
};
const config_t& config(void) const { return _cfg; }
void config(const config_t& config);
bool init(void) ;
void release(void) ;
void enable_double_dma_desc();
bool allocate_dma_desc_memory(size_t len);
void create_dma_desc_link(void *memory, size_t size, bool dmadesc_b = false);
void dma_transfer_start();
void dma_transfer_stop();
void flip_dma_output_buffer();
private:
config_t _cfg;
volatile lcd_cam_dev_t* _dev;
gdma_channel_handle_t dma_chan;
uint32_t _dmadesc_count = 0; // number of dma decriptors
uint32_t _dmadesc_a_idx = 0;
HUB75_DMA_DESCRIPTOR_T* _dmadesc_a = nullptr;
// uint32_t _clock_reg_value;
// uint32_t _fast_wait = 0;
//bool _double_dma_buffer = false;
//bool _dmadesc_a_active = true;
//uint32_t _dmadesc_b_idx = 0;
//HUB75_DMA_DESCRIPTOR_T* _dmadesc_b = nullptr;
esp_lcd_i80_bus_handle_t _i80_bus;
};
#endif

View file

@ -0,0 +1,57 @@
/*----------------------------------------------------------------------------/
Original Source:
https://github.com/lovyan03/LovyanGFX/
Licence:
[FreeBSD](https://github.com/lovyan03/LovyanGFX/blob/master/license.txt)
Author:
[lovyan03](https://twitter.com/lovyan03)
Contributors:
[ciniml](https://github.com/ciniml)
[mongonta0716](https://github.com/mongonta0716)
[tobozo](https://github.com/tobozo)
Modified heavily for the ESP32 HUB75 DMA library by:
[mrfaptastic](https://github.com/mrfaptastic)
/----------------------------------------------------------------------------*/
#pragma once
#if defined (ESP_PLATFORM)
#include <sdkconfig.h>
#if defined (CONFIG_IDF_TARGET_ESP32C3)
#error "ERROR: ESP32C3 not supported."
#elif defined (CONFIG_IDF_TARGET_ESP32S2)
#pragma message "Compiling for ESP32-S2"
#include "esp32/esp32_i2s_parallel_dma.hpp"
#include "esp32s2/esp32s2-default-pins.hpp"
#elif defined (CONFIG_IDF_TARGET_ESP32S3)
#pragma message "Compiling for ESP32-S3"
#include "esp32s3/gdma_lcd_parallel16.hpp"
#include "esp32s3/esp32s3-default-pins.hpp"
#else
// Assume an ESP32 (the original 2015 version)
// Same include as ESP32S3
#pragma message "Compiling for original ESP32 (2015 release)"
#define ESP32_THE_ORIG 1
//#include "esp32/esp32_i2s_parallel_dma.hpp"
//#include "esp32/esp32_i2s_parallel_dma.h"
#include "esp32/esp32_i2s_parallel_dma.hpp"
#include "esp32/esp32-default-pins.hpp"
#endif
#endif